(57) Abstract

Compounds and methods for diagnosing tuberculosis are disclosed. The compounds provided include polypeptides that contain at 
least one antigenic portion of one or more M. tuberculosis proteins, and DNA sequences encoding such polypeptides. Diagnostic kits 
containing such polypeptides or DNA sequences and a suitable detection reagent may be used for the detection of M. tuberculosis infection 
in patients and biological samples. Antibodies directed against such polypeptides are also provided. 
COMPOUNDS AND METHODS FOR DIAGNOSIS OF TUBERCULOSIS



TECHNICAL FIELD 

The present invention relates generally to the detection of Mycobacterium
tuberculosis infection. The invention is more particularly related to polypeptides comprising
a Mycobacterium tuberculosis antigen, or a portion or other variant thereof, and the use of
such polypeptides for the serodiagnosis of Mycobacterium tuberculosis infection.



BACKGROUND OF THE INVENTION 

Tuberculosis is a chronic, infectious disease that is generally caused by
infection with Mycobacterium tuberculosis. It is a major disease in developing countries, as
well as an increasing problem in developed areas of the world, with about 8 million new
cases and 3 million deaths each year. Although the infection may be asymptomatic for a
considerable period of time, the disease is most commonly manifested as an acute
inflammation of the lungs, resulting in fever and a nonproductive cough. If left untreated,
serious complications and death typically result.

Although tuberculosis can generally be controlled using extended antibiotic
therapy, such treatment is not sufficient to prevent the spread of the disease. Infected
individuals may be asymptomatic, but contagious, for some time. In addition, although
compliance with the treatment regimen is critical, patient behavior is difficult to monitor.
Some patients do not complete the course of treatment, which can lead to ineffective
treatment and the development of drug resistance.

Inhibiting the spread of tuberculosis will require effective vaccination and
accurate, early diagnosis of the disease. Currently, vaccination with live bacteria is the most
efficient method for inducing protective immunity. The most common Mycobacterium for
this purpose is Bacillus Calmette-Guerin (BCG), an avirulent strain of Mycobacterium bovis.
However, the safety and efficacy of BCG are a source of controversy and some countries, such
as the United States, do not vaccinate the general public. Diagnosis is commonly achieved
using a skin test, which involves intradermal exposure to tuberculin PPD (purified protein
derivative). Antigen-specific T cell responses result in measurable induration at the injection
site by 48-72 hours after injection, which indicates exposure to Mycobacterial antigens.
Sensitivity and specificity have, however, been a problem with this test, and individuals
vaccinated with BCG cannot be distinguished from infected individuals.

While macrophages have been shown to act as the principal effectors of
M. tuberculosis immunity, T cells are the predominant inducers of such immunity. The
essential role of T cells in protection against M. tuberculosis infection is illustrated by the
frequent occurrence of M. tuberculosis in AIDS patients, due to the depletion of CD4 T cells
associated with human immunodeficiency virus (HIV) infection. Mycobacterium-reactive
CD4 T cells have been shown to be potent producers of gamma-interferon (IFN-γ), which, in
turn, has been shown to trigger the anti-mycobacterial effects of macrophages in mice. While
the role of IFN-γ in humans is less clear, studies have shown that 1,25-dihydroxy-vitamin D3,
either alone or in combination with IFN-γ or tumor necrosis factor-alpha, activates human
macrophages to inhibit M. tuberculosis infection. Furthermore, it is known that IFN-γ
stimulates human macrophages to make 1,25-dihydroxy-vitamin D3. Similarly, IL-12 has
been shown to play a role in stimulating resistance to M. tuberculosis infection. For a review
of the immunology of M. tuberculosis infection see Chan and Kaufmann, in Tuberculosis:
Pathogenesis, Protection and Control, Bloom (ed.), ASM Press, Washington, DC, 1994.

Accordingly, there is a need in the art for improved diagnostic methods for
detecting tuberculosis. The present invention fulfills this need and further provides other
related advantages.



SUMMARY OF THE INVENTION 

Briefly stated, the present invention provides compositions and methods for
diagnosing tuberculosis. In one aspect, polypeptides are provided comprising an antigenic
portion of a soluble M. tuberculosis antigen, or a variant of such an antigen that differs only
in conservative substitutions and/or modifications. In one embodiment of this aspect, the
soluble antigen has one of the following N-terminal sequences:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-Gln- 
Val-Val-Ala-Ala-Leu (SEQ ID NO: 115); 



WO 98/16645 






PCT/US97/18214 



(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser
(SEQ ID NO: 116);

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala-
Lys-Glu-Gly-Arg (SEQ ID NO: 117);

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro
(SEQ ID NO: 118);

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val (SEQ ID
NO: 119);

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID
NO: 120);

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-Pro-Pro-
Ser (SEQ ID NO: 121);

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly
(SEQ ID NO: 122);

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-Thr-Ser-
Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-Ala-Asn (SEQ
ID NO: 123);

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-Ser;
(SEQ ID NO: 129)

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-Asp;
(SEQ ID NO: 130) or

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly;
(SEQ ID NO: 131)

wherein Xaa may be any amino acid.

In a related aspect, polypeptides are provided comprising an immunogenic
portion of an M. tuberculosis antigen, or a variant of such an antigen that differs only in
conservative substitutions and/or modifications, the antigen having one of the following
N-terminal sequences:

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile-
Asn-Val-His-Leu-Val; (SEQ ID NO: 132) or

(n) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-Tyr-
Pro-Gly-Gly-Arg-Arg-Xaa-Phe; (SEQ ID NO: 124)

wherein Xaa may be any amino acid.
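
As an illustration only, the N-terminal sequences recited above can be turned into searchable patterns in which Xaa matches any single residue. The following Python sketch is not part of the disclosed methods; the function names and the test protein strings are hypothetical, and the only sequence taken from the text is sequence (f) (SEQ ID NO: 120).

```python
import re

# Standard three-letter to one-letter amino acid codes; Xaa (any residue) maps to X.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(three_letter_seq):
    """Convert hyphen-separated three-letter codes to a one-letter string."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_seq.split("-"))


def n_terminal_pattern(three_letter_seq):
    """Build a regex anchored at the N-terminus; X matches any one residue."""
    one_letter = to_one_letter(three_letter_seq)
    return re.compile("^" + one_letter.replace("X", "[A-Z]"))


def has_n_terminus(protein, three_letter_seq):
    """Return True if the one-letter protein string starts with the pattern."""
    return n_terminal_pattern(three_letter_seq).match(protein) is not None


# Sequence (f), SEQ ID NO: 120, as recited in the text.
SEQ_F = "Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro"
```

Because the pattern is anchored at the first residue, a protein that carries the same residues after a leading methionine or signal peptide would not match; whether that is the desired behavior depends on how the N-terminus was determined.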

In another embodiment, the soluble M. tuberculosis antigen comprises an
amino acid sequence encoded by a DNA sequence selected from the group consisting of the
sequences recited in SEQ ID NOS: 1, 2, 4-10, 13-25, 52, 94 and 96, the complements of said
sequences, and DNA sequences that hybridize to a sequence recited in SEQ ID NOS: 1, 2,
4-10, 13-25, 52, 94 and 96 or a complement thereof under moderately stringent conditions.

In a related aspect, the polypeptides comprise an antigenic portion of an
M. tuberculosis antigen, or a variant of such an antigen that differs only in conservative
substitutions and/or modifications, wherein the antigen comprises an amino acid sequence
encoded by a DNA sequence selected from the group consisting of the sequences recited in
SEQ ID NOS: 26-51, 133, 134, 158-178 and 196, the complements of said sequences, and
DNA sequences that hybridize to a sequence recited in SEQ ID NOS: 26-51, 133, 134,
158-178 and 196 or a complement thereof under moderately stringent conditions.

In related aspects, DNA sequences encoding the above polypeptides,
recombinant expression vectors comprising these DNA sequences and host cells transformed
or transfected with such expression vectors are also provided.

In another aspect, the present invention provides fusion proteins comprising a 
first and a second inventive polypeptide or, alternatively, an inventive polypeptide and a 
known M. tuberculosis antigen. 

In further aspects of the subject invention, methods and diagnostic kits are
provided for detecting tuberculosis in a patient. The methods comprise: (a) contacting a
biological sample with at least one of the above polypeptides; and (b) detecting in the sample
the presence of antibodies that bind to the polypeptide or polypeptides, thereby detecting
M. tuberculosis infection in the biological sample. Suitable biological samples include whole
blood, sputum, serum, plasma, saliva, cerebrospinal fluid and urine. The diagnostic kits
comprise one or more of the above polypeptides in combination with a detection reagent.

The present invention also provides methods for detecting M. tuberculosis
infection comprising: (a) obtaining a biological sample from a patient; (b) contacting the
sample with at least one oligonucleotide primer in a polymerase chain reaction, the
oligonucleotide primer being specific for a DNA sequence encoding the above polypeptides;
and (c) detecting in the sample a DNA sequence that amplifies in the presence of the first and
second oligonucleotide primers. In one embodiment, the oligonucleotide primer comprises at
least about 10 contiguous nucleotides of such a DNA sequence.

In a further aspect, the present invention provides a method for detecting
M. tuberculosis infection in a patient comprising: (a) obtaining a biological sample from the
patient; (b) contacting the sample with an oligonucleotide probe specific for a DNA sequence
encoding the above polypeptides; and (c) detecting in the sample a DNA sequence that
hybridizes to the oligonucleotide probe. In one embodiment, the oligonucleotide probe
comprises at least about 15 contiguous nucleotides of such a DNA sequence.
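
The length limits given for primers (at least about 10 contiguous nucleotides) and probes (at least about 15 contiguous nucleotides) can be illustrated by enumerating every contiguous stretch of a given length from a coding sequence. The Python sketch below is a hedged illustration, not a primer-design method: the function names are hypothetical, "about 10" and "about 15" are treated as exact minimum lengths, and no melting temperature, specificity or secondary-structure checks are made.

```python
def contiguous_windows(dna, k):
    """Return every stretch of k contiguous nucleotides in dna, 5' to 3'."""
    dna = dna.upper()
    return [dna[i:i + k] for i in range(len(dna) - k + 1)]


# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement, e.g. for a reverse (second) primer."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def candidate_primer_pairs(dna, k=10):
    """Pair the first forward window with the reverse complement of the last.

    This only demonstrates orientation of a first and a second primer; a
    real design would screen all windows against selection criteria.
    """
    windows = contiguous_windows(dna, k)
    if not windows:
        return None
    return windows[0], reverse_complement(windows[-1])
```

For probes, the same `contiguous_windows` call with `k=15` lists the candidate 15-nucleotide stretches; a sequence shorter than `k` yields an empty list.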

In yet another aspect, the present invention provides antibodies, both
polyclonal and monoclonal, that bind to the polypeptides described above, as well as methods
for their use in the detection of M. tuberculosis infection.

These and other aspects of the present invention will become apparent upon
reference to the following detailed description and attached drawings. All references
disclosed herein are hereby incorporated by reference in their entirety as if each was
incorporated individually.

BRIEF DESCRIPTION OF THE DRAWINGS AND SEQUENCE IDENTIFIERS 

Figures 1A and 1B illustrate the stimulation of proliferation and interferon-γ
production in T cells derived from a first and a second M. tuberculosis-immune donor,
respectively, by the 14 Kd, 20 Kd and 26 Kd antigens described in Example 1.

Figures 2A-D illustrate the reactivity of antisera raised against secretory M.
tuberculosis proteins, the known M. tuberculosis antigen 85b and the inventive antigens
Tb38-1 and TbH-9, respectively, with M. tuberculosis lysate (lane 2), M. tuberculosis
secretory proteins (lane 3), recombinant Tb38-1 (lane 4), recombinant TbH-9 (lane 5) and
recombinant 85b (lane 5).



Figure 3A illustrates the stimulation of proliferation in a TbH-9-specific T cell
clone by secretory M. tuberculosis proteins, recombinant TbH-9 and a control antigen,
TbRa11.

Figure 3B illustrates the stimulation of interferon-γ production in a TbH-9-
specific T cell clone by secretory M. tuberculosis proteins, PPD and recombinant TbH-9.

Figure 4 illustrates the reactivity of two representative polypeptides with sera
from M. tuberculosis-infected and uninfected individuals, as compared to the reactivity of
bacterial lysate.

Figure 5 shows the reactivity of four representative polypeptides with sera
from M. tuberculosis-infected and uninfected individuals, as compared to the reactivity of the
38 kD antigen.

Figure 6 shows the reactivity of recombinant 38 kD and TbRa11 antigens with
sera from M. tuberculosis patients, PPD positive donors and normal donors.

Figure 7 shows the reactivity of the antigen TbRa2A with 38 kD negative sera. 
Figure 8 shows the reactivity of the antigen of SEQ ID NO: 60 with sera from 
M. tuberculosis patients and normal donors. 

Figure 9 illustrates the reactivity of the recombinant antigen TbH-29 (SEQ ID 
NO: 137) with sera from M. tuberculosis patients, PPD positive donors and normal donors as 
determined by indirect ELISA. 

Figure 10 illustrates the reactivity of the recombinant antigen TbH-33 (SEQ
ID NO: 140) with sera from M. tuberculosis patients and from normal donors, and with a pool
of sera from M. tuberculosis patients, as determined both by direct and indirect ELISA.

Figure 11 illustrates the reactivity of increasing concentrations of the
recombinant antigen TbH-33 (SEQ ID NO: 140) with sera from M. tuberculosis patients and
from normal donors as determined by ELISA.

SEQ. ID NO. 1 is the DNA sequence of TbRa1.
SEQ. ID NO. 2 is the DNA sequence of TbRa10.
SEQ. ID NO. 3 is the DNA sequence of TbRa11.
SEQ. ID NO. 4 is the DNA sequence of TbRa12.
SEQ. ID NO. 5 is the DNA sequence of TbRa13.
SEQ. ID NO. 6 is the DNA sequence of TbRa16.
SEQ. ID NO. 7 is the DNA sequence of TbRa17.
SEQ. ID NO. 8 is the DNA sequence of TbRa18.
SEQ. ID NO. 9 is the DNA sequence of TbRa19.
SEQ. ID NO. 10 is the DNA sequence of TbRa24.
SEQ. ID NO. 11 is the DNA sequence of TbRa26.
SEQ. ID NO. 12 is the DNA sequence of TbRa28.
SEQ. ID NO. 13 is the DNA sequence of TbRa29.
SEQ. ID NO. 14 is the DNA sequence of TbRa2A.
SEQ. ID NO. 15 is the DNA sequence of TbRa3.
SEQ. ID NO. 16 is the DNA sequence of TbRa32.
SEQ. ID NO. 17 is the DNA sequence of TbRa35.
SEQ. ID NO. 18 is the DNA sequence of TbRa36.
SEQ. ID NO. 19 is the DNA sequence of TbRa4.
SEQ. ID NO. 20 is the DNA sequence of TbRa9.
SEQ. ID NO. 21 is the DNA sequence of TbRaB.
SEQ. ID NO. 22 is the DNA sequence of TbRaC.
SEQ. ID NO. 23 is the DNA sequence of TbRaD.
SEQ. ID NO. 24 is the DNA sequence of YYWCPG.
SEQ. ID NO. 25 is the DNA sequence of AAMK.
SEQ. ID NO. 26 is the DNA sequence of TbL-23.
SEQ. ID NO. 27 is the DNA sequence of TbL-24.
SEQ. ID NO. 28 is the DNA sequence of TbL-25.
SEQ. ID NO. 29 is the DNA sequence of TbL-28.
SEQ. ID NO. 30 is the DNA sequence of TbL-29.
SEQ. ID NO. 31 is the DNA sequence of TbH-5.
SEQ. ID NO. 32 is the DNA sequence of TbH-8.
SEQ. ID NO. 33 is the DNA sequence of TbH-9.
SEQ. ID NO. 34 is the DNA sequence of TbM-1.
SEQ. ID NO. 35 is the DNA sequence of TbM-3.
SEQ. ID NO. 36 is the DNA sequence of TbM-6.
SEQ. ID NO. 37 is the DNA sequence of TbM-7.
SEQ. ID NO. 38 is the DNA sequence of TbM-9.
SEQ. ID NO. 39 is the DNA sequence of TbM-12.
SEQ. ID NO. 40 is the DNA sequence of TbM-13.
SEQ. ID NO. 41 is the DNA sequence of TbM-14.
SEQ. ID NO. 42 is the DNA sequence of TbM-15.
SEQ. ID NO. 43 is the DNA sequence of TbH-4.
SEQ. ID NO. 44 is the DNA sequence of TbH-4-FWD.
SEQ. ID NO. 45 is the DNA sequence of TbH-12.
SEQ. ID NO. 46 is the DNA sequence of Tb38-1.
SEQ. ID NO. 47 is the DNA sequence of Tb38-4.
SEQ. ID NO. 48 is the DNA sequence of TbL-17.
SEQ. ID NO. 49 is the DNA sequence of TbL-20.
SEQ. ID NO. 50 is the DNA sequence of TbL-21.
SEQ. ID NO. 51 is the DNA sequence of TbH-16.
SEQ. ID NO. 52 is the DNA sequence of DPEP.
SEQ. ID NO. 53 is the deduced amino acid sequence of DPEP.
SEQ. ID NO. 54 is the protein sequence of DPV N-terminal Antigen.
SEQ. ID NO. 55 is the protein sequence of AVGS N-terminal Antigen.
SEQ. ID NO. 56 is the protein sequence of AAMK N-terminal Antigen.
SEQ. ID NO. 57 is the protein sequence of YYWC N-terminal Antigen.
SEQ. ID NO. 58 is the protein sequence of DIGS N-terminal Antigen.
SEQ. ID NO. 59 is the protein sequence of AEES N-terminal Antigen.
SEQ. ID NO. 60 is the protein sequence of DPEP N-terminal Antigen.
SEQ. ID NO. 61 is the protein sequence of APKT N-terminal Antigen.
SEQ. ID NO. 62 is the protein sequence of DPAS N-terminal Antigen.
SEQ. ID NO. 63 is the deduced amino acid sequence of TbM-1 Peptide.
SEQ. ID NO. 64 is the deduced amino acid sequence of TbRa1.
SEQ. ID NO. 65 is the deduced amino acid sequence of TbRa10.
SEQ. ID NO. 66 is the deduced amino acid sequence of TbRa11.
SEQ. ID NO. 67 is the deduced amino acid sequence of TbRa12.
SEQ. ID NO. 68 is the deduced amino acid sequence of TbRa13.
SEQ. ID NO. 69 is the deduced amino acid sequence of TbRa16.
SEQ. ID NO. 70 is the deduced amino acid sequence of TbRa17.
SEQ. ID NO. 71 is the deduced amino acid sequence of TbRa18.
SEQ. ID NO. 72 is the deduced amino acid sequence of TbRa19.
SEQ. ID NO. 73 is the deduced amino acid sequence of TbRa24.
SEQ. ID NO. 74 is the deduced amino acid sequence of TbRa26.
SEQ. ID NO. 75 is the deduced amino acid sequence of TbRa28.
SEQ. ID NO. 76 is the deduced amino acid sequence of TbRa29.
SEQ. ID NO. 77 is the deduced amino acid sequence of TbRa2A.
SEQ. ID NO. 78 is the deduced amino acid sequence of TbRa3.
SEQ. ID NO. 79 is the deduced amino acid sequence of TbRa32.
SEQ. ID NO. 80 is the deduced amino acid sequence of TbRa35.
SEQ. ID NO. 81 is the deduced amino acid sequence of TbRa36.
SEQ. ID NO. 82 is the deduced amino acid sequence of TbRa4.
SEQ. ID NO. 83 is the deduced amino acid sequence of TbRa9.
SEQ. ID NO. 84 is the deduced amino acid sequence of TbRaB.
SEQ. ID NO. 85 is the deduced amino acid sequence of TbRaC.
SEQ. ID NO. 86 is the deduced amino acid sequence of TbRaD.
SEQ. ID NO. 87 is the deduced amino acid sequence of YYWCPG.
SEQ. ID NO. 88 is the deduced amino acid sequence of TbAAMK.
SEQ. ID NO. 89 is the deduced amino acid sequence of Tb38-1.
SEQ. ID NO. 90 is the deduced amino acid sequence of TbH-4.
SEQ. ID NO. 91 is the deduced amino acid sequence of TbH-8.
SEQ. ID NO. 92 is the deduced amino acid sequence of TbH-9.
SEQ. ID NO. 93 is the deduced amino acid sequence of TbH-12.
SEQ. ID NO. 94 is the DNA sequence of DPAS.
SEQ. ID NO. 95 is the deduced amino acid sequence of DPAS.
SEQ. ID NO. 96 is the DNA sequence of DPV.
SEQ. ID NO. 97 is the deduced amino acid sequence of DPV.
SEQ. ID NO. 98 is the DNA sequence of ESAT-6.
SEQ. ID NO. 99 is the deduced amino acid sequence of ESAT-6.
SEQ. ID NO. 100 is the DNA sequence of TbH-8-2.
SEQ. ID NO. 101 is the DNA sequence of TbH-9FL.
SEQ. ID NO. 102 is the deduced amino acid sequence of TbH-9FL.
SEQ. ID NO. 103 is the DNA sequence of TbH-9-1.
SEQ. ID NO. 104 is the deduced amino acid sequence of TbH-9-1.
SEQ. ID NO. 105 is the DNA sequence of TbH-9-4.
SEQ. ID NO. 106 is the deduced amino acid sequence of TbH-9-4.
SEQ. ID NO. 107 is the DNA sequence of Tb38-1F2 IN.
SEQ. ID NO. 108 is the DNA sequence of Tb38-1F2 RP.
SEQ. ID NO. 109 is the deduced amino acid sequence of Tb37-FL.
SEQ. ID NO. 110 is the deduced amino acid sequence of Tb38-IN.
SEQ. ID NO. 111 is the DNA sequence of Tb38-1F3.
SEQ. ID NO. 112 is the deduced amino acid sequence of Tb38-1F3.
SEQ. ID NO. 113 is the DNA sequence of Tb38-1F5.
SEQ. ID NO. 114 is the DNA sequence of Tb38-1F6.
SEQ. ID NO. 115 is the deduced N-terminal amino acid sequence of DPV.
SEQ. ID NO. 116 is the deduced N-terminal amino acid sequence of AVGS.
SEQ. ID NO. 117 is the deduced N-terminal amino acid sequence of AAMK.
SEQ. ID NO. 118 is the deduced N-terminal amino acid sequence of YYWC.
SEQ. ID NO. 119 is the deduced N-terminal amino acid sequence of DIGS.
SEQ. ID NO. 120 is the deduced N-terminal amino acid sequence of AAES.
SEQ. ID NO. 121 is the deduced N-terminal amino acid sequence of DPEP.
SEQ. ID NO. 122 is the deduced N-terminal amino acid sequence of APKT.
SEQ. ID NO. 123 is the deduced N-terminal amino acid sequence of DPAS.
SEQ. ID NO. 124 is the protein sequence of DPPD N-terminal Antigen.
SEQ ID NO. 125-128 are the protein sequences of four DPPD cyanogen bromide
fragments.
SEQ ID NO. 129 is the N-terminal protein sequence of XDS antigen.
SEQ ID NO. 130 is the N-terminal protein sequence of AGD antigen.
SEQ ID NO. 131 is the N-terminal protein sequence of APE antigen.
SEQ ID NO. 132 is the N-terminal protein sequence of XYI antigen.
SEQ ID NO. 133 is the DNA sequence of TbH-29.
SEQ ID NO. 134 is the DNA sequence of TbH-30.
SEQ ID NO. 135 is the DNA sequence of TbH-32.
SEQ ID NO. 136 is the DNA sequence of TbH-33.
SEQ ID NO. 137 is the predicted amino acid sequence of TbH-29.
SEQ ID NO. 138 is the predicted amino acid sequence of TbH-30.
SEQ ID NO. 139 is the predicted amino acid sequence of TbH-32.
SEQ ID NO. 140 is the predicted amino acid sequence of TbH-33.
SEQ ID NO: 141-146 are PCR primers used in the preparation of a fusion protein 
containing TbRa3, 38 kD and Tb38-1. 

SEQ ID NO: 147 is the DNA sequence of the fusion protein containing TbRa3, 38 kD
and Tb38-1.

SEQ ID NO: 148 is the amino acid sequence of the fusion protein containing TbRa3,
38 kD and Tb38-1.

SEQ ID NO: 149 is the DNA sequence of the M. tuberculosis antigen 38 kD. 

SEQ ID NO: 150 is the amino acid sequence of the M. tuberculosis antigen 38 kD. 

SEQ ID NO: 151 is the DNA sequence of XP14. 

SEQ ID NO: 152 is the DNA sequence of XP24. 

SEQ ID NO: 153 is the DNA sequence of XP31.

SEQ ID NO: 154 is the 5' DNA sequence of XP32. 

SEQ ID NO: 155 is the 3' DNA sequence of XP32. 

SEQ ID NO: 156 is the predicted amino acid sequence of XP14. 

SEQ ID NO: 157 is the predicted amino acid sequence encoded by the reverse 
complement of XP14. 
complement of TbH4-XP1.

SEQ ID NO: 181 is a first predicted amino acid sequence encoded by XP36. 
SEQ ID NO: 182 is a second predicted amino acid sequence encoded by XP36. 
SEQ ID NO: 183 is the predicted amino acid sequence encoded by the reverse 
complement of XP36. 

SEQ ID NO: 184 is the DNA sequence of RDIF2.
SEQ ID NO: 185 is the DNA sequence of RDIF5. 
containing TbRa3, 38 kD, Tb38-1 and DPEP (hereinafter referred to as TbF-2).
SEQ ID NO: 208 is the DNA sequence of the fusion protein TbF-2. 
SEQ ID NO: 209 is the amino acid sequence of the fusion protein TbF-2. 
DETAILED DESCRIPTION OF THE INVENTION

As noted above, the present invention is generally directed to compositions
and methods for diagnosing tuberculosis. The compositions of the subject invention include
polypeptides that comprise at least one antigenic portion of an M. tuberculosis antigen, or a
variant of such an antigen that differs only in conservative substitutions and/or modifications.
Polypeptides within the scope of the present invention include, but are not limited to, soluble
M. tuberculosis antigens. A "soluble M. tuberculosis antigen" is a protein of M. tuberculosis
origin that is present in M. tuberculosis culture filtrate. As used herein, the term
"polypeptide" encompasses amino acid chains of any length, including full length proteins
(i.e., antigens), wherein the amino acid residues are linked by covalent peptide bonds. Thus,
a polypeptide comprising an antigenic portion of one of the above antigens may consist
entirely of the antigenic portion, or may contain additional sequences. The additional
sequences may be derived from the native M. tuberculosis antigen or may be heterologous,
and such sequences may (but need not) be antigenic.
An "antigenic portion" of an antigen (which may or may not be soluble) is a
portion that is capable of reacting with sera obtained from an M. tuberculosis-infected
individual (i.e., generates an absorbance reading with sera from infected individuals that is at
least three standard deviations above the absorbance obtained with sera from uninfected
individuals, in a representative ELISA assay described herein). An "M. tuberculosis-infected
individual" is a human who has been infected with M. tuberculosis (e.g., has an intradermal
skin test response to PPD that is at least 0.5 cm in diameter). Infected individuals may
display symptoms of tuberculosis or may be free of disease symptoms. Polypeptides
comprising at least an antigenic portion of one or more M. tuberculosis antigens as described
herein may generally be used, alone or in combination, to detect tuberculosis in a patient.

The compositions and methods of this invention also encompass variants of
the above polypeptides. A "variant," as used herein, is a polypeptide that differs from the
native antigen only in conservative substitutions and/or modifications, such that the antigenic
properties of the polypeptide are retained. Such variants may generally be identified by
modifying one of the above polypeptide sequences, and evaluating the antigenic properties of
the modified polypeptide using, for example, the representative procedures described herein.

A "conservative substitution" is one in which an amino acid is substituted for
another amino acid that has similar properties, such that one skilled in the art of peptide
chemistry would expect the secondary structure and hydropathic nature of the polypeptide to
be substantially unchanged. In general, the following groups of amino acids represent
conservative changes: (1) ala, pro, gly, glu, asp, gln, asn, ser, thr; (2) cys, ser, tyr, thr; (3) val,
ile, leu, met, ala, phe; (4) lys, arg, his; and (5) phe, tyr, trp, his.
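
The five groups above lend themselves to a simple lookup. The following Python sketch is offered only as an illustration of how a candidate substitution might be checked against those groups; the function names are hypothetical, and the check covers substitutions only. It does not address "modifications" (additions or deletions), and it cannot establish that antigenic properties are retained, which the text requires to be evaluated experimentally.

```python
# The five groups of conservative changes, as listed in the text.
CONSERVATIVE_GROUPS = [
    {"ala", "pro", "gly", "glu", "asp", "gln", "asn", "ser", "thr"},
    {"cys", "ser", "tyr", "thr"},
    {"val", "ile", "leu", "met", "ala", "phe"},
    {"lys", "arg", "his"},
    {"phe", "tyr", "trp", "his"},
]


def is_conservative(original, replacement):
    """True if two different residues share at least one listed group."""
    a, b = original.lower(), replacement.lower()
    return a != b and any(a in g and b in g for g in CONSERVATIVE_GROUPS)


def differs_only_conservatively(native, variant):
    """Compare two equal-length residue lists position by position.

    Returns True when every position is either identical or a listed
    conservative change. Length changes are outside this sketch.
    """
    if len(native) != len(variant):
        return False
    return all(
        x.lower() == y.lower() or is_conservative(x, y)
        for x, y in zip(native, variant)
    )
```

Because several residues (for example ser, thr, ala, phe, tyr and his) appear in more than one group, membership is tested group by group rather than by assigning each residue a single class.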

Variants may also (or alternatively) be modified by, for example, the deletion
or addition of amino acids that have minimal influence on the antigenic properties, secondary
structure and hydropathic nature of the polypeptide. For example, a polypeptide may be
conjugated to a signal (or leader) sequence at the N-terminal end of the protein which
co-translationally or post-translationally directs transfer of the protein. The polypeptide may
also be conjugated to a linker or other sequence for ease of synthesis, purification or
identification of the polypeptide (e.g., poly-His), or to enhance binding of the polypeptide to a
solid support. For example, a polypeptide may be conjugated to an immunoglobulin Fc
region.

In a related aspect, combination polypeptides are disclosed. A "combination
polypeptide" is a polypeptide comprising at least one of the above antigenic portions and one
or more additional antigenic M. tuberculosis sequences, which are joined via a peptide
linkage into a single amino acid chain. The sequences may be joined directly (i.e., with no
intervening amino acids) or may be joined by way of a linker sequence (e.g., Gly-Cys-Gly)
that does not significantly diminish the antigenic properties of the component polypeptides.

In general, M. tuberculosis antigens, and DNA sequences encoding such
antigens, may be prepared using any of a variety of procedures. For example, soluble
antigens may be isolated from M. tuberculosis culture filtrate by procedures known to those
of ordinary skill in the art, including anion-exchange and reverse phase chromatography.
Purified antigens may then be evaluated for a desired property, such as the ability to react
with sera obtained from an M. tuberculosis-infected individual. Such screens may be
performed using the representative methods described herein. Antigens may then be partially
sequenced using, for example, traditional Edman chemistry. See Edman and Berg, Eur. J.
Biochem. 80:116-132, 1967.

Antigens may also be produced recombinantly using a DNA sequence that
encodes the antigen, which has been inserted into an expression vector and expressed in an
appropriate host. DNA molecules encoding soluble antigens may be isolated by screening an
appropriate M. tuberculosis expression library with anti-sera (e.g., rabbit) raised specifically
against soluble M. tuberculosis antigens. DNA sequences encoding antigens that may or may
not be soluble may be identified by screening an appropriate M. tuberculosis genomic or
cDNA expression library with sera obtained from patients infected with M. tuberculosis.
Such screens may generally be performed using techniques well known in the art, such as
those described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring
Harbor Laboratories, Cold Spring Harbor, NY, 1989.
DNA sequences encoding soluble antigens may also be obtained by screening
an appropriate M. tuberculosis cDNA or genomic DNA library for DNA sequences that
hybridize to degenerate oligonucleotides derived from partial amino acid sequences of
isolated soluble antigens. Degenerate oligonucleotide sequences for use in such a screen may
be designed and synthesized, and the screen may be performed, as described (for example) in
Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratories, Cold Spring Harbor, NY (and references cited therein). Polymerase chain
reaction (PCR) may also be employed, using the above oligonucleotides in methods well
known in the art, to isolate a nucleic acid probe from a cDNA or genomic library. The library
screen may then be performed using the isolated probe.
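
To illustrate what "degenerate oligonucleotides derived from partial amino acid sequences" means in practice, the Python sketch below back-translates a short peptide into an IUPAC-coded oligonucleotide using the standard genetic code. It is a simplified illustration and not the design procedure of the cited manual: the function names are hypothetical, each codon position is collapsed independently (so the codes for Leu, Ser and Arg cover a few codons for other residues), and no account is taken of M. tuberculosis codon usage.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# IUPAC nucleotide ambiguity codes, keyed by the sorted set of bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}
# Number of bases each IUPAC code stands for.
CODE_SIZE = {code: len(bases) for bases, code in IUPAC.items()}


def degenerate_codon(aa):
    """Collapse all codons for one residue into a single IUPAC codon."""
    codons = [c for c, a in CODONS.items() if a == aa]
    return "".join(
        IUPAC["".join(sorted({c[i] for c in codons}))] for i in range(3)
    )


def degenerate_oligo(peptide):
    """Back-translate a one-letter peptide into a degenerate oligonucleotide."""
    return "".join(degenerate_codon(aa) for aa in peptide)


def degeneracy(oligo):
    """Count the distinct sequences represented by a degenerate oligo."""
    total = 1
    for code in oligo:
        total *= CODE_SIZE[code]
    return total
```

For example, the first three residues of sequence (a), Asp-Pro-Val, give `degenerate_oligo("DPV")`, a nine-base oligonucleotide whose `degeneracy` is the product of the ambiguity at each position. In practice, stretches rich in Met and Trp are favored because they keep this number small.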

Regardless of the method of preparation, the antigens described herein are
"antigenic." More specifically, the antigens have the ability to react with sera obtained from
an M. tuberculosis-infected individual. Reactivity may be evaluated using, for example, the
representative ELISA assays described herein, where an absorbance reading with sera from
infected individuals that is at least three standard deviations above the absorbance obtained
with sera from uninfected individuals is considered positive.
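
The positivity criterion above can be written out as a short calculation. The Python sketch below is a minimal illustration under stated assumptions: "the absorbance obtained with sera from uninfected individuals" is taken to be the mean of the uninfected readings, the sample (n-1) standard deviation is used, and the function names and any readings passed to them are hypothetical.

```python
from statistics import mean, stdev


def positivity_cutoff(uninfected_od, n_sd=3.0):
    """Mean absorbance of uninfected sera plus n_sd sample standard deviations.

    Requires at least two uninfected readings, since a standard deviation
    cannot be estimated from a single value.
    """
    return mean(uninfected_od) + n_sd * stdev(uninfected_od)


def is_reactive(sample_od, uninfected_od, n_sd=3.0):
    """True if a reading is at least n_sd standard deviations above the mean."""
    return sample_od >= positivity_cutoff(uninfected_od, n_sd)
```

Using the population standard deviation instead would give a slightly lower cutoff for small control panels; the text does not say which is intended.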

Antigenic portions of M. tuberculosis antigens may be prepared and identified
using well known techniques, such as those summarized in Paul, Fundamental Immunology,
3d ed., Raven Press, 1993, pp. 243-247 and references cited therein. Such techniques include
screening polypeptide portions of the native antigen for antigenic properties. The
representative ELISAs described herein may generally be employed in these screens. An
antigenic portion of a polypeptide is a portion that, within such representative assays,
generates a signal that is substantially similar to that generated by the full
length antigen. In other words, an antigenic portion of an M. tuberculosis antigen generates at
least about 20%, and preferably about 100%, of the signal induced by the full length antigen
in a model ELISA as described herein.
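
The 20% criterion can likewise be expressed as a ratio of signals. The following Python sketch is illustrative only; the function names are hypothetical, "about 20%" is treated as a fixed threshold of 0.20, and the optional background subtraction is an assumption not stated in the text.

```python
def relative_signal(portion_od, full_length_od, background_od=0.0):
    """Signal of a portion as a fraction of the full length antigen's signal.

    background_od is an optional blank reading subtracted from both values
    (an assumption of this sketch).
    """
    denominator = full_length_od - background_od
    if denominator <= 0:
        raise ValueError("full length signal must exceed background")
    return (portion_od - background_od) / denominator


def qualifies_as_antigenic_portion(portion_od, full_length_od,
                                   background_od=0.0, threshold=0.20):
    """True if the portion yields at least the threshold fraction of signal."""
    fraction = relative_signal(portion_od, full_length_od, background_od)
    return fraction >= threshold
```

Both readings should come from the same assay run with the same sera, since the comparison is only meaningful within a single model ELISA.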

Portions and other variants of M. tuberculosis antigens may be generated by
synthetic or recombinant means. Synthetic polypeptides having fewer than about 100 amino
acids, and generally fewer than about 50 amino acids, may be generated using techniques
well known in the art. For example, such polypeptides may be synthesized using any of the
commercially available solid-phase techniques, such as the Merrifield solid-phase synthesis
method, where amino acids are sequentially added to a growing amino acid chain. See
Merrifield, J. Am. Chem. Soc. 85:2149-2154, 1963. Equipment for automated synthesis of
polypeptides is commercially available from suppliers such as Applied BioSystems, Inc.,
Foster City, CA, and may be operated according to the manufacturer's instructions. Variants
of a native antigen may generally be prepared using standard mutagenesis techniques, such as
oligonucleotide-directed site-specific mutagenesis. Sections of the DNA sequence may also
be removed using standard techniques to permit preparation of truncated polypeptides.

Recombinant polypeptides containing portions and/or variants of a native
antigen may be readily prepared from a DNA sequence encoding the polypeptide using a
variety of techniques well known to those of ordinary skill in the art. For example,
supernatants from suitable host/vector systems which secrete recombinant protein into culture
media may be first concentrated using a commercially available filter. Following
concentration, the concentrate may be applied to a suitable purification matrix such as an
affinity matrix or an ion exchange resin. Finally, one or more reverse phase HPLC steps can
be employed to further purify a recombinant protein.

Any of a variety of expression vectors known to those of ordinary skill in the
art may be employed to express recombinant polypeptides as described herein. Expression
may be achieved in any appropriate host cell that has been transformed or transfected with an
expression vector containing a DNA molecule that encodes a recombinant polypeptide.
Suitable host cells include prokaryotes, yeast and higher eukaryotic cells. Preferably, the host
cells employed are E. coli, yeast or a mammalian cell line, such as COS or CHO. The DNA
sequences expressed in this manner may encode naturally occurring antigens, portions of
naturally occurring antigens, or other variants thereof.

In general, regardless of the method of preparation, the polypeptides disclosed 
herein are prepared in substantially pure form. Preferably, the polypeptides are at least about 
80% pure, more preferably at least about 90% pure and most preferably at least about 99% 
pure. For use in the methods described herein, however, such substantially pure polypeptides 
may be combined. 






In certain specific embodiments, the subject invention discloses polypeptides 
comprising at least an antigenic portion of a soluble M. tuberculosis antigen (or a variant of 
such an antigen), where the antigen has one of the following N-terminal sequences: 

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-Gln- 
Val-Val-Ala-Ala-Leu (SEQ ID NO: 115); 

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser 
(SEQ ID NO: 116); 

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala- 
Lys-Glu-Gly-Arg (SEQ ID NO: 117); 

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro 
(SEQ ID NO: 118); 

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val (SEQ ID 
NO: 119); 

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID 
NO: 120); 

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-Pro-Pro- 
Ser (SEQ ID NO: 121); 

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly 
(SEQ ID NO: 122); 

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Gln-Thr-Ser- 
Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-Ala-Asn (SEQ 
ID NO: 123); 

(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-Ser 
(SEQ ID NO: 129); 

(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-Asp 
(SEQ ID NO: 130); or 

(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly 
(SEQ ID NO: 131) 

wherein Xaa may be any amino acid, preferably a cysteine residue. A DNA sequence 
encoding the antigen identified as (g) above is provided in SEQ ID NO: 52, the deduced 
amino acid sequence of which is provided in SEQ ID NO: 53. A DNA sequence encoding 
the antigen identified as (a) above is provided in SEQ ID NO: 96; its deduced amino acid 
sequence is provided in SEQ ID NO: 97. A DNA sequence corresponding to antigen (d) 
above is provided in SEQ ID NO: 24, a DNA sequence corresponding to antigen (c) is 
provided in SEQ ID NO: 25, and a DNA sequence corresponding to antigen (I) is disclosed in 
SEQ ID NO: 94, with its deduced amino acid sequence provided in SEQ ID NO: 95. 

In a further specific embodiment, the subject invention discloses polypeptides 
comprising at least an immunogenic portion of an M. tuberculosis antigen having one of the 
following N-terminal sequences, or a variant thereof that differs only in conservative 
substitutions and/or modifications: 

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile- 
Asn-Val-His-Leu-Val (SEQ ID NO: 132); or 

(n) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-Tyr- 
Pro-Gly-Gly-Arg-Arg-Xaa-Phe (SEQ ID NO: 124) 

wherein Xaa may be any amino acid, preferably a cysteine residue. 
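
By way of illustration only, the Xaa convention used in the sequences recited above may be handled in software by treating Xaa as a wildcard position. The following minimal sketch assumes proteins are supplied in one-letter code; the function names are hypothetical and form no part of the disclosure.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}
ANY_RESIDUE = "[ACDEFGHIKLMNPQRSTVWY]"


def nterm_pattern(three_letter_seq):
    """Compile a regex for an N-terminal sequence; Xaa matches any residue."""
    parts = []
    for residue in three_letter_seq.split("-"):
        parts.append(ANY_RESIDUE if residue == "Xaa" else THREE_TO_ONE[residue])
    return re.compile("^" + "".join(parts))


def has_nterm(protein, three_letter_seq):
    """True if the one-letter protein string begins with the given sequence."""
    return nterm_pattern(three_letter_seq).match(protein) is not None


# Sequence (e) recited above (SEQ ID NO: 119), with one Xaa position.
SEQ_E = "Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val"
```

With this sketch, a hypothetical protein beginning DIGSESTEDQQCAV satisfies sequence (e), as would one having any other residue at the Xaa position.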

In other specific embodiments, the subject invention discloses polypeptides 
comprising at least an antigenic portion of a soluble M. tuberculosis antigen (or a variant of 
such an antigen) that comprises one or more of the amino acid sequences encoded by (a) the 
DNA sequences of SEQ ID NOS: 1, 2, 4-10, 13-25, 52, 94 and 96, (b) the complements of 
such DNA sequences, or (c) DNA sequences substantially homologous to a sequence in (a) or 
(b). 

In further specific embodiments, the subject invention discloses polypeptides 
comprising at least an antigenic portion of an M. tuberculosis antigen (or a variant of such an 
antigen), which may or may not be soluble, that comprises one or more of the amino acid 
sequences encoded by (a) the DNA sequences of SEQ ID NOS: 26-51, 133, 134, 158-178 and 
196, (b) the complements of such DNA sequences or (c) DNA sequences substantially 
homologous to a sequence in (a) or (b). 

In the specific embodiments discussed above, the M. tuberculosis antigens 
include variants that are encoded by DNA sequences which are substantially homologous to one 
or more of the DNA sequences specifically recited herein. "Substantial homology," as used 
herein, refers to DNA sequences that are capable of hybridizing under moderately stringent 
conditions. Suitable moderately stringent conditions include prewashing in a solution of 5X 
SSC, 0.5% SDS, 1.0 mM EDTA (pH 8.0); hybridizing at 50°C-65°C, 5X SSC, overnight or, 
in the event of cross-species homology, at 45°C with 0.5X SSC; followed by washing twice 
at 65°C for 20 minutes with each of 2X, 0.5X and 0.2X SSC containing 0.1% SDS. Such 
hybridizing DNA sequences are also within the scope of this invention, as are nucleotide 
sequences that, due to code degeneracy, encode an immunogenic polypeptide that is encoded 
by a hybridizing DNA sequence. 
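
Code degeneracy, as referred to above, may be illustrated by the following minimal sketch, which builds the standard genetic code and shows two different DNA sequences that encode the same short peptide. The two example sequences are hypothetical and are not taken from the sequence listing; the function name is likewise an assumption.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# Two hypothetical sequences that differ at every third base yet both
# encode Asp-Pro-Val, the first three residues of sequence (a).
VARIANT_1 = "GATCCGGTG"
VARIANT_2 = "GACCCCGTC"
```

Both variants translate to the one-letter string DPV, so each would encode the same polypeptide despite differing in nucleotide sequence.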

In a related aspect, the present invention provides fusion proteins comprising a 
first and a second inventive polypeptide or, alternatively, a polypeptide of the present 
invention and a known M. tuberculosis antigen, such as the 38 kD antigen described above or 
ESAT-6 (SEQ ID NOS: 98 and 99), together with variants of such fusion proteins. The 
fusion proteins of the present invention may also include a linker peptide between the first 
and second polypeptides. 

A DNA sequence encoding a fusion protein of the present invention is 
constructed using known recombinant DNA techniques to assemble separate DNA sequences 
encoding the first and second polypeptides into an appropriate expression vector. The 3' end 
of a DNA sequence encoding the first polypeptide is ligated, with or without a peptide linker, 
to the 5' end of a DNA sequence encoding the second polypeptide so that the reading frames 
of the sequences are in phase to permit mRNA translation of the two DNA sequences into a 
single fusion protein that retains the biological activity of both the first and the second 
polypeptides. 
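
The requirement that the reading frames be in phase may be sketched as follows. This is an illustrative check only, assuming uppercase coding sequences that each begin at the first base of a codon; the function names and example sequences are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(dna):
    """Split a coding sequence into consecutive triplets."""
    return [dna[i:i + 3] for i in range(0, len(dna), 3)]


def join_in_frame(first, second, linker=""):
    """Join coding sequences 5' to 3' so the second stays in the first's frame.

    Each segment must be a whole number of codons, and the first segment
    (and linker) must not carry a stop codon, which would end translation
    before the second polypeptide is reached.
    """
    for name, seg in (("first", first), ("linker", linker), ("second", second)):
        if len(seg) % 3 != 0:
            raise ValueError(f"{name} segment length {len(seg)} is not a multiple of 3")
    fused = first + linker + second
    # A stop codon anywhere except the final position truncates the fusion.
    for index, codon in enumerate(codons(fused)[:-1]):
        if codon in STOP_CODONS:
            raise ValueError(f"premature stop codon {codon} at codon {index}")
    return fused
```

For example, joining the hypothetical segments ATGGCC and GATCCGTAA through the linker GGTTCT (Gly-Ser) yields a single open reading frame, whereas leaving a stop codon at the end of the first segment raises an error.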

A peptide linker sequence may be employed to separate the first and the 
second polypeptides by a distance sufficient to ensure that each polypeptide folds into its 
secondary and tertiary structures. Such a peptide linker sequence is incorporated into the 
fusion protein using standard techniques well known in the art. Suitable peptide linker 
sequences may be chosen based on the following factors: (1) their ability to adopt a flexible 
extended conformation; (2) their inability to adopt a secondary structure that could interact 
with functional epitopes on the first and second polypeptides; and (3) the lack of hydrophobic 
or charged residues that might react with the polypeptide functional epitopes. Preferred 
peptide linker sequences contain Gly, Asn and Ser residues. Other near neutral amino acids, 
such as Thr and Ala, may also be used in the linker sequence. Amino acid sequences which 
may be usefully employed as linkers include those disclosed in Maratea et al., Gene 40:39-46, 
1985; Murphy et al., Proc. Natl. Acad. Sci. USA 83:8258-8262, 1986; U.S. Patent 
No. 4,935,233 and U.S. Patent No. 4,751,180. The linker sequence may be from 1 to about 
50 amino acids in length. Peptide linker sequences are not required when the first and second 
polypeptides have non-essential N-terminal amino acid regions that can be used to separate 
the functional domains and prevent steric hindrance. 

In another aspect, the present invention provides methods for using the 
polypeptides described above to diagnose tuberculosis. In this aspect, methods are provided 
for detecting M. tuberculosis infection in a biological sample, using one or more of the above 
polypeptides, alone or in combination. In embodiments in which multiple polypeptides are 
employed, polypeptides other than those specifically described herein, such as the 38 kD 
antigen described in Andersen and Hansen, Infect. Immun. 57:2481-2488, 1989, may be 
included. As used herein, a "biological sample" is any antibody-containing sample obtained 
from a patient. Preferably, the sample is whole blood, sputum, serum, plasma, saliva, 
cerebrospinal fluid or urine. More preferably, the sample is a blood, serum or plasma sample 
obtained from a patient or a blood supply. The polypeptide(s) are used in an assay, as 
described below, to determine the presence or absence of antibodies to the polypeptide(s) in 
the sample, relative to a predetermined cut-off value. The presence of such antibodies 
indicates previous sensitization to mycobacterial antigens which may be indicative of 
tuberculosis. 

In embodiments in which more than one polypeptide is employed, the 
polypeptides used are preferably complementary (i.e., one component polypeptide will tend 
to detect infection in samples where the infection would not be detected by another 
component polypeptide). Complementary polypeptides may generally be identified by using 
each polypeptide individually to evaluate serum samples obtained from a series of patients 
known to be infected with M. tuberculosis. After determining which samples test positive (as 
described below) with each polypeptide, combinations of two or more polypeptides may be 
formulated that are capable of detecting infection in most, or all, of the samples tested. Such 
polypeptides are complementary. For example, approximately 25-30% of sera from 
tuberculosis-infected individuals are negative for antibodies to any single protein, such as the 
38 kD antigen mentioned above. Complementary polypeptides may, therefore, be used in 
combination with the 38 kD antigen to improve sensitivity of a diagnostic test. 
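
The identification of complementary polypeptides described above may be sketched, by way of example only, as a greedy selection over the per-polypeptide results. The greedy strategy, the function name and the sample data are assumptions made for illustration; the disclosure does not prescribe a particular selection procedure.

```python
def pick_complementary(positives, all_samples):
    """Greedily choose antigens that together detect the most infected samples.

    positives   -- maps an antigen name to the set of known-infected sample
                   IDs that tested positive with that antigen alone
    all_samples -- every known-infected sample ID that was evaluated
    Returns (chosen antigen names in order, sample IDs still undetected).
    """
    undetected = set(all_samples)
    chosen = []
    if not positives:
        return chosen, undetected
    while undetected:
        # Pick the antigen that detects the most not-yet-detected samples.
        best = max(positives, key=lambda name: len(positives[name] & undetected))
        gained = positives[best] & undetected
        if not gained:
            break  # no remaining antigen adds any further detection
        chosen.append(best)
        undetected -= gained
    return chosen, undetected


# Hypothetical results for ten sera from infected individuals.
example_positives = {
    "38 kD": {1, 2, 3, 4, 5, 6, 7},
    "antigen X": {6, 7, 8, 9},
    "antigen Y": {1, 2, 8},
}
example_choice = pick_complementary(example_positives, range(1, 11))
```

In this hypothetical data set the 38 kD antigen alone detects seven of ten sera, antigen X adds two more, and antigen Y adds none, so the first two would be regarded as complementary.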

There are a variety of assay formats known to those of ordinary skill in the art 
for using one or more polypeptides to detect antibodies in a sample. See, e.g., Harlow and 
Lane, Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, 1988, which is 
incorporated herein by reference. In a preferred embodiment, the assay involves the use of 
polypeptide immobilized on a solid support to bind to and remove the antibody from the 
sample. The bound antibody may then be detected using a detection reagent that contains a 
reporter group. Suitable detection reagents include antibodies that bind to the 
antibody/polypeptide complex and free polypeptide labeled with a reporter group (e.g., in a 
semi-competitive assay). Alternatively, a competitive assay may be utilized, in which an 
antibody that binds to the polypeptide is labeled with a reporter group and allowed to bind to 
the immobilized antigen after incubation of the antigen with the sample. The extent to which 
components of the sample inhibit the binding of the labeled antibody to the polypeptide is 
indicative of the reactivity of the sample with the immobilized polypeptide. 

The solid support may be any solid material known to those of ordinary skill 
in the art to which the antigen may be attached. For example, the solid support may be a test 
well in a microtiter plate or a nitrocellulose or other suitable membrane. Alternatively, the 
support may be a bead or disc, such as glass, fiberglass, latex or a plastic material such as 
polystyrene or polyvinylchloride. The support may also be a magnetic particle or a fiber 
optic sensor, such as those disclosed, for example, in U.S. Patent No. 5,359,681. 

The polypeptides may be bound to the solid support using a variety of 
techniques known to those of ordinary skill in the art, which are amply described in the patent 
and scientific literature. In the context of the present invention, the term "bound" refers to 
both noncovalent association, such as adsorption, and covalent attachment (which may be a 
direct linkage between the antigen and functional groups on the support or may be a linkage 
by way of a cross-linking agent). Binding by adsorption to a well in a microtiter plate or to a 
membrane is preferred. In such cases, adsorption may be achieved by contacting the 
polypeptide, in a suitable buffer, with the solid support for a suitable amount of time. The 
contact time varies with temperature, but is typically between about 1 hour and 1 day. In 
general, contacting a well of a plastic microtiter plate (such as polystyrene or 
polyvinylchloride) with an amount of polypeptide ranging from about 10 ng to about 1 µg, 
and preferably about 100 ng, is sufficient to bind an adequate amount of antigen. 

Covalent attachment of polypeptide to a solid support may generally be 
achieved by first reacting the support with a bifunctional reagent that will react with both the 
support and a functional group, such as a hydroxyl or amino group, on the polypeptide. For 
example, the polypeptide may be bound to supports having an appropriate polymer coating 
using benzoquinone or by condensation of an aldehyde group on the support with an amine 
and an active hydrogen on the polypeptide (see, e.g., Pierce Immunotechnology Catalog and 
Handbook, 1991, at A12-A13). 

In certain embodiments, the assay is an enzyme linked immunosorbent assay 
(ELISA). This assay may be performed by first contacting a polypeptide antigen that has 
been immobilized on a solid support, commonly the well of a microtiter plate, with the 
sample, such that antibodies to the polypeptide within the sample are allowed to bind to the 
immobilized polypeptide. Unbound sample is then removed from the immobilized 
polypeptide and a detection reagent capable of binding to the immobilized antibody- 
polypeptide complex is added. The amount of detection reagent that remains bound to the 
solid support is then determined using a method appropriate for the specific detection reagent. 

More specifically, once the polypeptide is immobilized on the support as 
described above, the remaining protein binding sites on the support are typically blocked. 
Any suitable blocking agent known to those of ordinary skill in the art, such as bovine serum 
albumin or Tween 20™ (Sigma Chemical Co., St. Louis, MO) may be employed. The 
immobilized polypeptide is then incubated with the sample, and antibody is allowed to bind 
to the antigen. The sample may be diluted with a suitable diluent, such as phosphate-buffered 
saline (PBS) prior to incubation. In general, an appropriate contact time (i.e., incubation 
time) is that period of time that is sufficient to detect the presence of antibody within a 
M. tuberculosis-infected sample. Preferably, the contact time is sufficient to achieve a level 
of binding that is at least 95% of that achieved at equilibrium between bound and unbound 
antibody. Those of ordinary skill in the art will recognize that the time necessary to achieve 
equilibrium may be readily determined by assaying the level of binding that occurs over a 
period of time. At room temperature, an incubation time of about 30 minutes is generally 
sufficient. 

Unbound sample may then be removed by washing the solid support with an 
appropriate buffer, such as PBS containing 0.1% Tween 20™. Detection reagent may then be 
added to the solid support. An appropriate detection reagent is any compound that binds to 
the immobilized antibody-polypeptide complex and that can be detected by any of a variety 
of means known to those in the art. Preferably, the detection reagent contains a binding agent 
(such as, for example, Protein A, Protein G, immunoglobulin, lectin or free antigen) 
conjugated to a reporter group. Preferred reporter groups include enzymes (such as 
horseradish peroxidase), substrates, cofactors, inhibitors, dyes, radionuclides, luminescent 
groups, fluorescent groups and biotin. The conjugation of binding agent to reporter group 
may be achieved using standard methods known to those of ordinary skill in the art. 
Common binding agents may also be purchased conjugated to a variety of reporter groups 
from many commercial sources (e.g., Zymed Laboratories, San Francisco, CA, and Pierce, 
Rockford, IL). 

The detection reagent is then incubated with the immobilized antibody- 
polypeptide complex for an amount of time sufficient to detect the bound antibody. An 
appropriate amount of time may generally be determined from the manufacturer's instructions 
or by assaying the level of binding that occurs over a period of time. Unbound detection 
reagent is then removed and bound detection reagent is detected using the reporter group. 
The method employed for detecting the reporter group depends upon the nature of the 
reporter group. For radioactive groups, scintillation counting or autoradiographic methods 
are generally appropriate. Spectroscopic methods may be used to detect dyes, luminescent 
groups and fluorescent groups. Biotin may be detected using avidin, coupled to a different 
reporter group (commonly a radioactive or fluorescent group or an enzyme). Enzyme 
reporter groups may generally be detected by the addition of substrate (generally for a 
specific period of time), followed by spectroscopic or other analysis of the reaction products. 

To determine the presence or absence of anti-M. tuberculosis antibodies in the 
sample, the signal detected from the reporter group that remains bound to the solid support is 
generally compared to a signal that corresponds to a predetermined cut-off value. In one 
preferred embodiment, the cut-off value is the mean signal obtained when the 
immobilized antigen is incubated with samples from an uninfected patient. In general, a 
sample generating a signal that is three standard deviations above the predetermined cut-off 
value is considered positive for tuberculosis. In an alternate preferred embodiment, the cut- 
off value is determined using a Receiver Operator Curve, according to the method of Sackett 
et al., Clinical Epidemiology: A Basic Science for Clinical Medicine, Little Brown and Co., 
1985, pp. 106-107. Briefly, in this embodiment, the cut-off value may be determined from a 
plot of pairs of true positive rates (i.e., sensitivity) and false positive rates (100%-specificity) 
that correspond to each possible cut-off value for the diagnostic test result. The cut-off value 
on the plot that is the closest to the upper left-hand corner (i.e., the value that encloses the 
largest area) is the most accurate cut-off value, and a sample generating a signal that is higher 
than the cut-off value determined by this method may be considered positive. Alternatively, 
the cut-off value may be shifted to the left along the plot, to minimize the false positive rate, 
or to the right, to minimize the false negative rate. In general, a sample generating a signal 
that is higher than the cut-off value determined by this method is considered positive for 
tuberculosis. 
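
The two approaches to the cut-off value described above may be sketched as follows. This is a minimal illustration under stated assumptions: the sample standard deviation is used, a signal strictly greater than the cut-off is scored positive, and "closest to the upper left-hand corner" is taken as the smallest Euclidean distance to the point of zero false positive rate and 100% sensitivity. The function names and example signals are hypothetical.

```python
from math import hypot
from statistics import mean, stdev


def positive_threshold(uninfected_signals, n_sd=3.0):
    """Mean signal from uninfected samples plus n_sd standard deviations."""
    return mean(uninfected_signals) + n_sd * stdev(uninfected_signals)


def roc_cutoff(infected, uninfected):
    """Cut-off whose (false positive rate, sensitivity) point is nearest (0, 1).

    Every observed signal is tried as a candidate cut-off.
    """
    best_cutoff, best_dist = None, float("inf")
    for cutoff in sorted(set(infected) | set(uninfected)):
        sensitivity = sum(s > cutoff for s in infected) / len(infected)
        false_pos = sum(s > cutoff for s in uninfected) / len(uninfected)
        dist = hypot(false_pos, 1.0 - sensitivity)
        if dist < best_dist:
            best_cutoff, best_dist = cutoff, dist
    return best_cutoff


# Hypothetical optical densities.
example_infected = [0.9, 0.8, 0.7, 0.6]
example_uninfected = [0.10, 0.15, 0.30, 0.12]
example_cutoff = roc_cutoff(example_infected, example_uninfected)
```

Shifting the selected cut-off upward or downward from the value returned here corresponds to moving left or right along the plot, trading false positives against false negatives as described above.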

In a related embodiment, the assay is performed in a rapid flow-through or 
strip test format, wherein the antigen is immobilized on a membrane, such as nitrocellulose. 
In the flow-through test, antibodies within the sample bind to the immobilized polypeptide as 
the sample passes through the membrane. A detection reagent (e.g., protein A-colloidal gold) 
then binds to the antibody-polypeptide complex as the solution containing the detection 
reagent flows through the membrane. The detection of bound detection reagent may then be 
performed as described above. In the strip test format, one end of the membrane to which 
polypeptide is bound is immersed in a solution containing the sample. The sample migrates 
along the membrane through a region containing detection reagent and to the area of 
immobilized polypeptide. Concentration of detection reagent at the polypeptide indicates the 
presence of anti-M. tuberculosis antibodies in the sample. Typically, the concentration of 
detection reagent at that site generates a pattern, such as a line, that can be read visually. The 
absence of such a pattern indicates a negative result. In general, the amount of polypeptide 
immobilized on the membrane is selected to generate a visually discernible pattern when the 
biological sample contains a level of antibodies that would be sufficient to generate a positive 
signal in an ELISA, as discussed above. Preferably, the amount of polypeptide immobilized 
on the membrane ranges from about 25 ng to about 1 µg, and more preferably from about 
50 ng to about 500 ng. Such tests can typically be performed with a very small amount (e.g., 
one drop) of patient serum or blood. 

Of course, numerous other assay protocols exist that are suitable for use with 
the polypeptides of the present invention. The above descriptions are intended to be 
exemplary only. 

In yet another aspect, the present invention provides antibodies to the 
inventive polypeptides. Antibodies may be prepared by any of a variety of techniques known 
to those of ordinary skill in the art. See, e.g., Harlow and Lane, Antibodies: A Laboratory 
Manual, Cold Spring Harbor Laboratory, 1988. In one such technique, an immunogen 
comprising the antigenic polypeptide is initially injected into any of a wide variety of 
mammals (e.g., mice, rats, rabbits, sheep and goats). In this step, the polypeptides of this 
invention may serve as the immunogen without modification. Alternatively, particularly for 
relatively short polypeptides, a superior immune response may be elicited if the polypeptide 
is joined to a carrier protein, such as bovine serum albumin or keyhole limpet hemocyanin. 
The immunogen is injected into the animal host, preferably according to a predetermined 
schedule incorporating one or more booster immunizations, and the animals are bled 
periodically. Polyclonal antibodies specific for the polypeptide may then be purified from 
such antisera by, for example, affinity chromatography using the polypeptide coupled to a 
suitable solid support. 

Monoclonal antibodies specific for the antigenic polypeptide of interest may 
be prepared, for example, using the technique of Kohler and Milstein, Eur. J. Immunol. 
6:511-519, 1976, and improvements thereto. Briefly, these methods involve the preparation 
of immortal cell lines capable of producing antibodies having the desired specificity (i.e., 
reactivity with the polypeptide of interest). Such cell lines may be produced, for example, 
from spleen cells obtained from an animal immunized as described above. The spleen cells 
are then immortalized by, for example, fusion with a myeloma cell fusion partner, preferably 
one that is syngeneic with the immunized animal. A variety of fusion techniques may be 
employed. For example, the spleen cells and myeloma cells may be combined with a 
nonionic detergent for a few minutes and then plated at low density on a selective medium 
that supports the growth of hybrid cells, but not myeloma cells. A preferred selection 
technique uses HAT (hypoxanthine, aminopterin, thymidine) selection. After a sufficient 
time, usually about 1 to 2 weeks, colonies of hybrids are observed. Single colonies are 
selected and tested for binding activity against the polypeptide. Hybridomas having high 
reactivity and specificity are preferred. 

Monoclonal antibodies may be isolated from the supernatants of growing 
hybridoma colonies. In addition, various techniques may be employed to enhance the yield, 
such as injection of the hybridoma cell line into the peritoneal cavity of a suitable vertebrate 
host, such as a mouse. Monoclonal antibodies may then be harvested from the ascites fluid or 
the blood. Contaminants may be removed from the antibodies by conventional techniques, 
such as chromatography, gel filtration, precipitation, and extraction. The polypeptides of this 
invention may be used in the purification process in, for example, an affinity chromatography 
step. 

Antibodies may be used in diagnostic tests to detect the presence of 
M. tuberculosis antigens using assays similar to those detailed above and other techniques 
well known to those of skill in the art, thereby providing a method for detecting 
M. tuberculosis infection in a patient. 

Diagnostic reagents of the present invention may also comprise DNA 
sequences encoding one or more of the above polypeptides, or one or more portions thereof. 
For example, at least two oligonucleotide primers may be employed in a polymerase chain 
reaction (PCR) based assay to amplify M. tuberculosis-specific cDNA derived from a 
biological sample, wherein at least one of the oligonucleotide primers is specific for a DNA 
molecule encoding a polypeptide of the present invention. The presence of the amplified 
cDNA is then detected using techniques well known in the art, such as gel electrophoresis. 
Similarly, oligonucleotide probes specific for a DNA molecule encoding a polypeptide of the 
present invention may be used in a hybridization assay to detect the presence of an inventive 
polypeptide in a biological sample. 

As used herein, the term "oligonucleotide primer/probe specific for a DNA 
molecule" means an oligonucleotide sequence that has at least about 80%, preferably at least 
about 90% and more preferably at least about 95%, identity to the DNA molecule in question. 
Oligonucleotide primers and/or probes which may be usefully employed in the inventive 
diagnostic methods preferably have at least about 10-40 nucleotides. In a preferred 
embodiment, the oligonucleotide primers comprise at least about 10 contiguous nucleotides 
of a DNA molecule encoding one of the polypeptides disclosed herein. Preferably, 
oligonucleotide probes for use in the inventive diagnostic methods comprise at least about 15 
contiguous nucleotides of a DNA molecule encoding one of the polypeptides disclosed 
herein. Techniques for both PCR based assays and hybridization assays are well known in 
the art (see, for example, Mullis et al. Ibid; Ehrlich, Ibid). Primers or probes may thus be 
used to detect M. tuberculosis-specific sequences in biological samples. DNA probes or 
primers comprising oligonucleotide sequences described above may be used alone, in 
combination with each other, or with previously identified sequences, such as the 38 kD 
antigen discussed above. 
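
The identity criterion defined above may be illustrated with the following minimal sketch. The text does not specify how identity is to be computed; the sketch assumes an ungapped, same-strand comparison of the oligonucleotide against every window of equal length in the target DNA molecule. The function names and sequences are hypothetical.

```python
def best_identity(oligo, target):
    """Highest ungapped percent identity of oligo against any window of target."""
    oligo, target = oligo.upper(), target.upper()
    if not oligo or len(oligo) > len(target):
        raise ValueError("oligo must be non-empty and no longer than target")
    best_matches = 0
    for start in range(len(target) - len(oligo) + 1):
        window = target[start:start + len(oligo)]
        matches = sum(a == b for a, b in zip(oligo, window))
        best_matches = max(best_matches, matches)
    return 100.0 * best_matches / len(oligo)


def is_specific(oligo, target, min_identity=80.0):
    """Apply the 'at least about 80% identity' criterion as a hard threshold."""
    return best_identity(oligo, target) >= min_identity


# Hypothetical target and a 10-nucleotide primer with a single mismatch.
example_target = "ATGGATCCGGTGGATGCGGTG"
example_primer = "GATCCGGTGC"
```

Here the primer matches the target at nine of ten positions (90% identity) and therefore meets the 80% criterion, although it falls short of the more preferred 95% level.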

The following Examples are offered by way of illustration and not by way of 
limitation. 

EXAMPLES 

EXAMPLE 1 

PURIFICATION AND CHARACTERIZATION OF POLYPEPTIDES 
FROM M. TUBERCULOSIS CULTURE FILTRATE 

This example illustrates the preparation of M. tuberculosis soluble 
polypeptides from culture filtrate. Unless otherwise noted, all percentages in the following 
example are weight per volume. 

M. tuberculosis (either H37Ra, ATCC No. 25177, or H37Rv, ATCC 
No. 25618) was cultured in sterile GAS media at 37°C for fourteen days. The media was 
then vacuum filtered (leaving the bulk of the cells) through a 0.45 µ filter into a sterile 2.5 L 
bottle. The media was then filtered through a 0.2 µ filter into a sterile 4 L bottle. NaN3 was 
then added to the culture filtrate to a concentration of 0.04%. The bottles were then placed in 
a 4°C cold room. 

The culture filtrate was concentrated by placing the filtrate in a 12 L reservoir 
that had been autoclaved and feeding the filtrate into a 400 ml Amicon stir cell which had 
been rinsed with ethanol and contained a 10,000 Da MWCO membrane. The pressure was 

maintained at 60 psi using nitrogen gas. This procedure reduced the 12 L volume to 
approximately 50 ml. 

The culture filtrate was then dialyzed into 0.1% ammonium bicarbonate using 
an 8,000 Da MWCO cellulose ester membrane, with two changes of ammonium bicarbonate 
solution. Protein concentration was then determined by a commercially available BCA assay 
(Pierce, Rockford, IL). 

The dialyzed culture filtrate was then lyophilized, and the polypeptides 
resuspended in distilled water. The polypeptides were then dialyzed against 0.01 mM 1,3 
bis[tris(hydroxymethyl)-methylamino]propane, pH 7.5 (Bis-Tris propane buffer), the initial 
conditions for anion exchange chromatography. Fractionation was performed using gel 
perfusion chromatography on a POROS 146 II Q/M anion exchange column 4.6 mm x 
100 mm (Perseptive BioSystems, Framingham, MA) equilibrated in 0.01 mM Bis-Tris 
propane buffer pH 7.5. Polypeptides were eluted with a linear 0-0.5 M NaCl gradient in the 
above buffer system. The column eluent was monitored at a wavelength of 220 nm. 

The pools of polypeptides eluting from the ion exchange column were 
dialyzed against distilled water and lyophilized. The resulting material was dissolved in 0.1% 
trifluoroacetic acid (TFA) pH 1.9 in water, and the polypeptides were purified on a Delta-Pak 
C18 column (Waters, Milford, MA) 300 Angstrom pore size, 5 micron particle size (3.9 x 
150 mm). The polypeptides were eluted from the column with a linear gradient from 0-60% 
dilution buffer (0.1% TFA in acetonitrile). The flow rate was 0.75 ml/minute and the HPLC 
eluent was monitored at 214 nm. Fractions containing the eluted polypeptides were collected 
to maximize the purity of the individual samples. Approximately 200 purified polypeptides 
were obtained. 

The purified polypeptides were then screened for the ability to induce T-cell 
proliferation in PBMC preparations. The PBMCs from donors known to be PPD skin test 
positive and whose T cells were shown to proliferate in response to PPD and crude soluble 
proteins from MTB were cultured in medium comprising RPMI 1640 supplemented with 
10% pooled human serum and 50 µg/ml gentamicin. Purified polypeptides were added in 
duplicate at concentrations of 0.5 to 10 µg/mL. After six days of culture in 96-well round- 
bottom plates in a volume of 200 µl, 50 µl of medium was removed from each well for 
determination of IFN-γ levels, as described below. The plates were then pulsed with 
1 µCi/well of tritiated thymidine for a further 18 hours, harvested and tritium uptake 
determined using a gas scintillation counter. Fractions that resulted in proliferation in both 
replicates three fold greater than the proliferation observed in cells cultured in medium alone 
were considered positive. 

IFN-γ was measured using an enzyme-linked immunosorbent assay (ELISA). 
ELISA plates were coated with a mouse monoclonal antibody directed to human IFN-γ 
(Chemicon) in PBS for four hours at room temperature. Wells were then blocked with PBS 
containing 5% (W/V) non-fat dried milk for 1 hour at room temperature. The plates were 
then washed six times in PBS/0.2% TWEEN-20 and samples diluted 1:2 in culture medium 
in the ELISA plates were incubated overnight at room temperature. The plates were again 
washed and a polyclonal rabbit anti-human IFN-γ serum diluted 1:3000 in PBS/10% normal 
goat serum was added to each well. The plates were then incubated for two hours at room 
temperature, washed and horseradish peroxidase-coupled anti-rabbit IgG (Jackson Labs.) was 
added at a 1:2000 dilution in PBS/5% non-fat dried milk. After a further two hour incubation 
at room temperature, the plates were washed and TMB substrate added. The reaction was 
stopped after 20 min with 1 N sulfuric acid. Optical density was determined at 450 nm using 
570 nm as a reference wavelength. Fractions that resulted in both replicates giving an OD 
two fold greater than the mean OD from cells cultured in medium alone, plus 3 standard 
deviations, were considered positive. 
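
The two positivity criteria stated above can be sketched in code. This is only an illustrative reading of the text, not the procedure actually used: the function names are hypothetical, the medium-alone baseline is assumed to be the mean of several control wells, and the IFN-γ cutoff is assumed to mean (2 x mean control OD) + (3 x control standard deviation).

```python
from statistics import mean, stdev


def proliferation_positive(replicate_cpm, medium_cpm, fold=3.0):
    """Return True if every replicate exceeds `fold` times the mean
    tritium uptake of cells cultured in medium alone (assumed baseline)."""
    baseline = mean(medium_cpm)
    return all(cpm > fold * baseline for cpm in replicate_cpm)


def ifn_gamma_positive(replicate_od, medium_od, fold=2.0, n_sd=3.0):
    """Return True if every replicate OD exceeds
    fold * mean(medium OD) + n_sd * SD(medium OD).
    This cutoff formula is one interpretation of the criterion in the text."""
    cutoff = fold * mean(medium_od) + n_sd * stdev(medium_od)
    return all(od > cutoff for od in replicate_od)


if __name__ == "__main__":
    # Hypothetical readings, for illustration only.
    print(proliferation_positive([3500, 4000], [1000, 1100, 900]))
    print(ifn_gamma_positive([0.30, 0.35], [0.10, 0.12, 0.08]))
```

Requiring both replicates to pass, as the text does, guards against a single outlying well being scored as a positive fraction.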

For sequencing, the polypeptides were individually dried onto
Biobrene™ (Perkin Elmer/Applied BioSystems Division, Foster City, CA) treated glass fiber
filters. The filters with polypeptide were loaded onto a Perkin Elmer/Applied BioSystems
Division Procise 492 protein sequencer. The polypeptides were sequenced from the amino
terminus using traditional Edman chemistry. The amino acid sequence was determined
for each polypeptide by comparing the retention time of the PTH amino acid derivative to the
appropriate PTH derivative standards.

Using the procedure described above, antigens having the following
N-terminal sequences were isolated:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Xaa-Asn-Tyr-Gly-Gln-
    Val-Val-Ala-Ala-Leu (SEQ ID NO: 54);
(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser
    (SEQ ID NO: 55);
(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala-
    Lys-Glu-Gly-Arg (SEQ ID NO: 56);
(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro
    (SEQ ID NO: 57);
(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val (SEQ ID
    NO: 58);
(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID
    NO: 59);
(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Ala-Ala-Ala-Ala-Pro-Pro-
    Ala (SEQ ID NO: 60); and
(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly
    (SEQ ID NO: 61);
wherein Xaa may be any amino acid.
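
For database comparison, N-terminal sequences written in three-letter code are usually converted to one-letter code. The following is a minimal sketch of that conversion, assuming the standard IUPAC abbreviations and mapping Xaa to X; the function name `to_one_letter` is hypothetical and not part of the disclosure.

```python
# Standard three-letter to one-letter amino acid codes; "Xaa" (any residue) maps to "X".
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
    "Xaa": "X",
}


def to_one_letter(three_letter_seq):
    """Convert a hyphen-separated three-letter sequence to one-letter code.
    Raises KeyError on an unrecognized residue rather than guessing."""
    residues = [r.strip() for r in three_letter_seq.split("-") if r.strip()]
    return "".join(THREE_TO_ONE[r] for r in residues)


if __name__ == "__main__":
    # Antigen (e), SEQ ID NO: 58
    print(to_one_letter("Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val"))
```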

An additional antigen was isolated employing a microbore HPLC purification
step in addition to the procedure described above. Specifically, 20 µl of a fraction comprising
a mixture of antigens from the chromatographic purification step previously described, was
purified on an Aquapore C18 column (Perkin Elmer/Applied Biosystems Division, Foster
City, CA) with a 7 micron pore size, column size 1 mm x 100 mm, in a Perkin Elmer/Applied
Biosystems Division Model 172 HPLC. Fractions were eluted from the column with a linear
gradient of 1%/minute of acetonitrile (containing 0.05% TFA) in water (0.05% TFA) at a
flow rate of 80 µl/minute. The eluent was monitored at 250 nm. The original fraction was
separated into 4 major peaks plus other smaller components and a polypeptide was obtained
which was shown to have a molecular weight of 12.054 Kd (by mass spectrometry) and the
following N-terminal sequence:

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Gln-Thr-Ser-
    Leu-Leu-Asn-Asn-Leu-Ala-Asp-Pro-Asp-Val-Ser-Phe-Ala-Asp (SEQ
    ID NO: 62).

This polypeptide was shown to induce proliferation and IFN-γ production in PBMC
preparations using the assays described above.

Additional soluble antigens were isolated from M. tuberculosis culture filtrate
as follows. M. tuberculosis culture filtrate was prepared as described above. Following
dialysis against Bis-Tris propane buffer, at pH 5.5, fractionation was performed using anion
exchange chromatography on a Poros QE column 4.6 x 100 mm (Perseptive Biosystems)
equilibrated in Bis-Tris propane buffer pH 5.5. Polypeptides were eluted with a linear 0-1.5
M NaCl gradient in the above buffer system at a flow rate of 10 ml/min. The column eluent
was monitored at a wavelength of 214 nm.

The fractions eluting from the ion exchange column were pooled and
subjected to reverse phase chromatography using a Poros R2 column 4.6 x 100 mm
(Perseptive Biosystems). Polypeptides were eluted from the column with a linear gradient
from 0-100% acetonitrile (0.1% TFA) at a flow rate of 5 ml/min. The eluent was monitored
at 214 nm.

Fractions containing the eluted polypeptides were lyophilized and resuspended
in 80 µl of aqueous 0.1% TFA and further subjected to reverse phase chromatography on a
Vydac C4 column 4.6 x 150 mm (Western Analytical, Temecula, CA) with a linear gradient
of 0-100% acetonitrile (0.1% TFA) at a flow rate of 2 ml/min. Eluent was monitored at 214
nm.

The fraction with biological activity was separated into one major peak plus
other smaller components. Western blot of this peak onto PVDF membrane revealed three
major bands of molecular weights 14 Kd, 20 Kd and 26 Kd. These polypeptides were
determined to have the following N-terminal sequences, respectively:
(j) Xaa-Asp-Ser-Glu-Lys-Ser-Ala-Thr-Ile-Lys-Val-Thr-Asp-Ala-Ser;
    (SEQ ID NO: 129)
(k) Ala-Gly-Asp-Thr-Xaa-Ile-Tyr-Ile-Val-Gly-Asn-Leu-Thr-Ala-Asp;
    (SEQ ID NO: 130) and
(l) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly;
    (SEQ ID NO: 131), wherein Xaa may be any amino acid.

Using the assays described above, these polypeptides were shown to induce proliferation and
IFN-γ production in PBMC preparations. Figs. 1A and B show the results of such assays
using PBMC preparations from a first and a second donor, respectively.

DNA sequences that encode the antigens designated as (a), (c), (d) and (g)
above were obtained by screening a M. tuberculosis genomic library using 32P end-labeled
degenerate oligonucleotides corresponding to the N-terminal sequence and containing
M. tuberculosis codon bias. The screen performed using a probe corresponding to antigen (a)
above identified a clone having the sequence provided in SEQ ID NO: 96. The polypeptide
encoded by SEQ ID NO: 96 is provided in SEQ ID NO: 97. The screen performed using a
probe corresponding to antigen (g) above identified a clone having the sequence provided in
SEQ ID NO: 52. The polypeptide encoded by SEQ ID NO: 52 is provided in SEQ ID
NO: 53. The screen performed using a probe corresponding to antigen (d) above identified a
clone having the sequence provided in SEQ ID NO: 24, and the screen performed with a
probe corresponding to antigen (c) identified a clone having the sequence provided in SEQ ID
NO: 25.
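
A degenerate oligonucleotide pool must cover every codon combination that could encode the chosen stretch of N-terminal sequence, which is why codon bias is used to narrow the pool. The sketch below only counts the unbiased degeneracy under the standard genetic code; it does not model M. tuberculosis codon preference, it does not handle Xaa positions, and the function name `degeneracy` is a hypothetical illustration rather than the probe design actually used.

```python
from math import prod

# Number of codons per amino acid in the standard genetic code.
CODONS_PER_RESIDUE = {
    "Ala": 4, "Arg": 6, "Asn": 2, "Asp": 2, "Cys": 2,
    "Gln": 2, "Glu": 2, "Gly": 4, "His": 2, "Ile": 3,
    "Leu": 6, "Lys": 2, "Met": 1, "Phe": 2, "Pro": 4,
    "Ser": 6, "Thr": 4, "Trp": 1, "Tyr": 2, "Val": 4,
}


def degeneracy(residues):
    """Number of distinct nucleotide sequences that could encode `residues`
    if every synonymous codon were allowed (no codon bias applied)."""
    return prod(CODONS_PER_RESIDUE[r] for r in residues)


if __name__ == "__main__":
    # First six residues of antigen (a): 2 * 4 * 4 * 2 * 4 * 4 = 1024
    print(degeneracy(["Asp", "Pro", "Val", "Asp", "Ala", "Val"]))
    # First six residues of antigen (d): 2 * 2 * 1 * 2 * 4 * 4 = 128
    print(degeneracy(["Tyr", "Tyr", "Trp", "Cys", "Pro", "Gly"]))
```

Stretches rich in Trp, Met, Tyr or Cys give smaller pools, so in practice the probe window would be chosen with that in mind.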

The above amino acid sequences were compared to known amino acid
sequences in the gene bank using the DNA STAR system. The database searched contains
some 173,000 proteins and is a combination of the Swiss, PIR databases along with translated
protein sequences (Version 87). No significant homologies to the amino acid sequences for
antigens (a)-(h) and (l) were detected.

The amino acid sequence for antigen (i) was found to be homologous to a
sequence from M. leprae. The full length M. leprae sequence was amplified from genomic
DNA using the sequence obtained from GENBANK. This sequence was then used to screen
an M. tuberculosis library and a full length copy of the M. tuberculosis homologue was
obtained (SEQ ID NO: 94).

The amino acid sequence for antigen (j) was found to be homologous to a
known M. tuberculosis protein translated from a DNA sequence. To the best of the
inventors' knowledge, this protein has not been previously shown to possess T-cell
stimulatory activity. The amino acid sequence for antigen (k) was found to be related to a
sequence from M. leprae.

In the proliferation and IFN-γ assays described above, using three PPD
positive donors, the results for representative antigens provided above are presented in Table
1:

TABLE 1

Results of PBMC Proliferation and IFN-γ Assays

Sequence        Proliferation        IFN-γ

(a)             +
(c)                                  +++
(d)             ++                   ++
(g)             +++                  +++
(h)             +++                  +++


In Table 1, responses that gave a stimulation index (SI) of between 2 and 4
(compared to cells cultured in medium alone) were scored as +, an SI of 4-8 or 2-4 at a
concentration of 1 µg or less was scored as ++ and an SI of greater than 8 was scored as +++.
The antigen of sequence (i) was found to have a high SI (+++) for one donor and lower SI
(++ and +) for the two other donors in both proliferation and IFN-γ assays. These results
indicate that these antigens are capable of inducing proliferation and/or interferon-γ
production.
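
The scoring scheme used for Table 1 can be expressed as a small function. This is a sketch under stated assumptions: the text does not say how an SI of exactly 4 or 8 is handled or how responses below 2 are labeled, so the boundary handling and the "-" label below are illustrative choices, and the name `score_si` is hypothetical.

```python
def score_si(si, conc_ug=None):
    """Score a stimulation index (SI) as in Table 1.

    si      : response divided by the response of cells in medium alone
    conc_ug : antigen concentration in micrograms, if known
    """
    if si > 8:
        return "+++"
    if si >= 4:                      # assumed: 4 <= SI <= 8 scores ++
        return "++"
    if si >= 2:
        # An SI of 2-4 is upgraded to ++ when seen at 1 microgram or less.
        if conc_ug is not None and conc_ug <= 1:
            return "++"
        return "+"
    return "-"                       # assumed label for SI below 2


if __name__ == "__main__":
    for si, conc in [(3, 10), (3, 0.5), (6, 10), (9, 10)]:
        print(si, conc, score_si(si, conc))
```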

EXAMPLE 2 

Use of Patient Sera to Isolate M. tuberculosis Antigens

This example illustrates the isolation of antigens from M. tuberculosis lysate 
by screening with serum from M. tuberculosis-infected individuals. 

Desiccated M. tuberculosis H37Ra (Difco Laboratories) was added to a 2%
NP40 solution, and alternately homogenized and sonicated three times. The resulting
suspension was centrifuged at 13,000 rpm in microfuge tubes and the supernatant put through
a 0.2 micron syringe filter. The filtrate was bound to Macro Prep DEAE beads (BioRad,
Hercules, CA). The beads were extensively washed with 20 mM Tris pH 7.5 and bound
proteins eluted with 1 M NaCl. The NaCl eluate was dialyzed overnight against 10 mM Tris,
pH 7.5. Dialyzed solution was treated with DNase and RNase at 0.05 mg/ml for 30 min. at
room temperature and then with α-D-mannosidase, 0.5 U/mg at pH 4.5 for 3-4 hours at room
temperature. After returning to pH 7.5, the material was fractionated via FPLC over a Bio
Scale-Q-20 column (BioRad). Fractions were combined into nine pools, concentrated in a
Centriprep 10 (Amicon, Beverly, MA) and screened by Western blot for serological activity
using a serum pool from M. tuberculosis-infected patients which was not immunoreactive
with other antigens of the present invention.

The most reactive fraction was run in SDS-PAGE and transferred to PVDF. A
band at approximately 85 Kd was cut out yielding the sequence:

(m) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile-
    Asn-Val-His-Leu-Val; (SEQ ID NO: 132), wherein Xaa may be any
    amino acid.

Comparison of this sequence with those in the gene bank as described above, 
revealed no significant homologies to known sequences. 

A DNA sequence that encodes the antigen designated as (m) above was
obtained by screening a genomic M. tuberculosis Erdman strain library using labeled
degenerate oligonucleotides corresponding to the N-terminal sequence of SEQ ID NO: 137. A
clone was identified having the DNA sequence provided in SEQ ID NO: 198. This sequence
was found to encode the amino acid sequence provided in SEQ ID NO: 199. Comparison of
these sequences with those in the genebank revealed some similarity to sequences previously
identified in M. tuberculosis and M. bovis.

EXAMPLE 3 

Preparation of DNA Sequences Encoding M. tuberculosis Antigens

This example illustrates the preparation of DNA sequences encoding
M. tuberculosis antigens by screening a M. tuberculosis expression library with sera obtained
from patients infected with M. tuberculosis, or with anti-sera raised against M. tuberculosis
antigens.

A. Preparation of M. tuberculosis Soluble Antigens using Rabbit Anti-sera
Raised against M. tuberculosis Supernatant

Genomic DNA was isolated from the M. tuberculosis strain H37Ra. The DNA
was randomly sheared and used to construct an expression library using the Lambda ZAP
expression system (Stratagene, La Jolla, CA). Rabbit anti-sera was generated against
secretory proteins of the M. tuberculosis strains H37Ra, H37Rv and Erdman by immunizing a
rabbit with concentrated supernatant of the M. tuberculosis cultures. Specifically, the rabbit
was first immunized subcutaneously with 200 µg of protein antigen in a total volume of 2 ml
containing 100 µg muramyl dipeptide (Calbiochem, La Jolla, CA) and 1 ml of incomplete
Freund's adjuvant. Four weeks later the rabbit was boosted subcutaneously with 100 µg
antigen in incomplete Freund's adjuvant. Finally, the rabbit was immunized intravenously
four weeks later with 50 µg protein antigen. The anti-sera were used to screen the expression
library as described in Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratories, Cold Spring Harbor, NY, 1989. Bacteriophage plaques
expressing immunoreactive antigens were purified. Phagemid from the plaques was rescued
and the nucleotide sequences of the M. tuberculosis clones deduced.

Thirty two clones were purified. Of these, 25 represent sequences that have
not been previously identified in M. tuberculosis. Proteins were induced by IPTG and
purified by gel elution, as described in Skeiky et al., J. Exp. Med. 181:1527-1537, 1995.
Representative partial sequences of DNA molecules identified in this screen are provided in
SEQ ID NOS: 1-25. The corresponding predicted amino acid sequences are shown in SEQ
ID NOS: 64-88.

On comparison of these sequences with known sequences in the gene bank
using the databases described above, it was found that the clones referred to hereinafter as
TbRA2A, TbRA16, TbRA18, and TbRA29 (SEQ ID NOS: 77, 69, 71, 76) show some
homology to sequences previously identified in Mycobacterium leprae but not in
M. tuberculosis. TbRA11, TbRA26, TbRA28 and TbDPEP (SEQ ID NOS: 66, 74, 75, 53)
have been previously identified in M. tuberculosis. No significant homologies were found to
TbRA1, TbRA3, TbRA4, TbRA9, TbRA10, TbRA13, TbRA17, TbRA19, TbRA29,
TbRA32, TbRA36 and the overlapping clones TbRA35 and TbRA12 (SEQ ID NOS: 64, 78,
82, 83, 65, 68, 76, 72, 76, 79, 81, 80, 67, respectively). The clone TbRa24 is overlapping
with clone TbRa29.

B. Use of Sera from Patients having Pulmonary or Pleural Tuberculosis to
Identify DNA Sequences Encoding M. tuberculosis Antigens

The genomic DNA library described above, and an additional H37Rv library,
were screened using pools of sera obtained from patients with active tuberculosis. To prepare
the H37Rv library, M. tuberculosis strain H37Rv genomic DNA was isolated, subjected to
partial Sau3A digestion and used to construct an expression library using the Lambda Zap
expression system (Stratagene, La Jolla, CA). Three different pools of sera, each containing
sera obtained from three individuals with active pulmonary or pleural disease, were used in
the expression screening. The pools were designated TbL, TbM and TbH, referring to
relative reactivity with H37Ra lysate (i.e., TbL = low reactivity, TbM = medium reactivity
and TbH = high reactivity) in both ELISA and immunoblot format. A fourth pool of sera
from seven patients with active pulmonary tuberculosis was also employed. All of the sera
lacked increased reactivity with the recombinant 38 kD M. tuberculosis H37Ra phosphate-
binding protein.

All pools were pre-adsorbed with E. coli lysate and used to screen the H37Ra
and H37Rv expression libraries, as described in Sambrook et al., Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor Laboratories, Cold Spring Harbor, NY, 1989.
Bacteriophage plaques expressing immunoreactive antigens were purified. Phagemid from
the plaques was rescued and the nucleotide sequences of the M. tuberculosis clones deduced.

Thirty two clones were purified. Of these, 31 represented sequences that had
not been previously identified in human M. tuberculosis. Representative sequences of the
DNA molecules identified are provided in SEQ ID NOS: 26-51 and 100. Of these, TbH-8-2
(SEQ. ID NO. 100) is a partial clone of TbH-8, and TbH-4 (SEQ. ID NO. 43) and TbH-4-
FWD (SEQ. ID NO. 44) are non-contiguous sequences from the same clone. Amino acid
sequences for the antigens hereinafter identified as Tb38-1, TbH-4, TbH-8, TbH-9, and
TbH-12 are shown in SEQ ID NOS.: 89-93. Comparison of these sequences with known
sequences in the gene bank using the databases identified above revealed no significant
homologies to TbH-4, TbH-8, TbH-9 and TbM-3, although weak homologies were found to
TbH-9. TbH-12 was found to be homologous to a 34 kD antigenic protein previously
identified in M. paratuberculosis (Acc. No. S28515). Tb38-1 was found to be located 34
base pairs upstream of the open reading frame for the antigen ESAT-6 previously identified
in M. bovis (Acc. No. U34848) and in M. tuberculosis (Sorensen et al., Infec. Immun.
63:1710-1717, 1995).

Probes derived from Tb38-1 and TbH-9, both isolated from an H37Ra library,
were used to identify clones in an H37Rv library. Tb38-1 hybridized to Tb38-1F2,
Tb38-1F3, Tb38-1F5 and Tb38-1F6 (SEQ. ID NOS: 107, 108, 111, 113, and 114). (SEQ ID NOS:
107 and 108 are non-contiguous sequences from clone Tb38-1F2.) Two open reading frames
were deduced in Tb38-1F2; one corresponds to Tb37FL (SEQ. ID. NO. 109), the second, a
partial sequence, may be the homologue of Tb38-1 and is called Tb38-IN (SEQ. ID NO. 110).
The deduced amino acid sequence of Tb38-1F3 is presented in SEQ. ID. NO. 112. A TbH-9
probe identified three clones in the H37Rv library: TbH-9-FL (SEQ. ID NO. 101), which
may be the homologue of TbH-9 (H37Ra), TbH-9-1 (SEQ. ID NO. 103), and TbH-8-2 (SEQ.
ID NO. 105), which is a partial clone of TbH-8. The deduced amino acid sequences for these three
clones are presented in SEQ ID NOS: 102, 104 and 106.

Further screening of the M. tuberculosis genomic DNA library, as described
above, resulted in the recovery of ten additional reactive clones, representing seven different
genes. One of these genes was identified as the 38 Kd antigen discussed above, one was
determined to be identical to the 14 Kd alpha crystallin heat shock protein previously shown
to be present in M. tuberculosis, and a third was determined to be identical to the antigen
TbH-8 described above. The determined DNA sequences for the remaining five clones
(hereinafter referred to as TbH-29, TbH-30, TbH-32 and TbH-33) are provided in SEQ ID
NO: 133-136, respectively, with the corresponding predicted amino acid sequences being
provided in SEQ ID NO: 137-140, respectively. The DNA and amino acid sequences for
these antigens were compared with those in the gene bank as described above. No
homologies were found to the 5' end of TbH-29 (which contains the reactive open reading
frame), although the 3' end of TbH-29 was found to be identical to the M. tuberculosis
cosmid Y227. TbH-32 and TbH-33 were found to be identical to the previously identified
M. tuberculosis insertion element IS6110 and to the M. tuberculosis cosmid Y50,
respectively. No significant homologies to TbH-30 were found.

Positive phagemid from this additional screening were used to infect E. coli
XL-1 Blue MRF', as described in Sambrook et al., supra. Induction of recombinant protein
was accomplished by the addition of IPTG. Induced and uninduced lysates were run in
duplicate on SDS-PAGE and transferred to nitrocellulose filters. Filters were reacted with
human M. tuberculosis sera (1:200 dilution) reactive with TbH and a rabbit sera (1:200 or
1:250 dilution) reactive with the N-terminal 4 Kd portion of lacZ. Sera incubations were
performed for 2 hours at room temperature. Bound antibody was detected by addition of
125I-labeled Protein A and subsequent exposure to film for variable times ranging from 16 hours
to 11 days. The results of the immunoblots are summarized in Table 2.

TABLE 2

                Human M. tb        Anti-lacZ
Antigen         Sera               Sera

TbH-29          45 Kd              45 Kd
TbH-30          No reactivity      29 Kd
TbH-32          12 Kd              12 Kd
TbH-33          16 Kd              16 Kd



Positive reaction of the recombinant human M. tuberculosis antigens with both
the human M. tuberculosis sera and anti-lacZ sera indicates that reactivity of the human
M. tuberculosis sera is directed towards the fusion protein. Antigens reactive with the anti-lacZ
sera but not with the human M. tuberculosis sera may be the result of the human
M. tuberculosis sera recognizing conformational epitopes, or the antigen-antibody binding
kinetics may be such that the 2 hour sera exposure in the immunoblot is not sufficient.

Studies were undertaken to determine whether the antigens TbH-9 and Tb38-1
represent cellular proteins or are secreted into M. tuberculosis culture media. In the first
study, rabbit sera were raised against A) secretory proteins of M. tuberculosis, B) the known
secretory recombinant M. tuberculosis antigen 85b, C) recombinant Tb38-1 and D)
recombinant TbH-9, using protocols substantially as described in Example 3A. Total
M. tuberculosis lysate, concentrated supernatant of M. tuberculosis cultures and the recombinant
antigens 85b, TbH-9 and Tb38-1 were resolved on denaturing gels, immobilized on
nitrocellulose membranes and duplicate blots were probed using the rabbit sera described
above.

The results of this analysis using control sera (panel I) and antisera (panel II)
against secretory proteins, recombinant 85b, recombinant Tb38-1 and recombinant TbH-9 are
shown in Figures 2A-D, respectively, wherein the lane designations are as follows: 1)
molecular weight protein standards; 2) 5 µg of M. tuberculosis lysate; 3) 5 µg secretory
proteins; 4) 50 ng recombinant Tb38-1; 5) 50 ng recombinant TbH-9; and 6) 50 ng
recombinant 85b. The recombinant antigens were engineered with six terminal histidine
residues and would therefore be expected to migrate with a mobility approximately 1 kD
larger than the native protein. In Figure 2D, recombinant TbH-9 is lacking approximately 10
kD of the full-length 42 kD antigen, hence the significant difference in the size of the
immunoreactive native TbH-9 antigen in the lysate lane (indicated by an arrow). These
results demonstrate that Tb38-1 and TbH-9 are intracellular antigens and are not actively
secreted by M. tuberculosis.
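
The expected size shift from the six terminal histidine residues can be checked with simple arithmetic. The sketch below assumes an average mass of 137.14 Da per histidine residue within a peptide chain (free histidine, 155.15 Da, less one water); the constant and function names are hypothetical, and the result only approximates gel mobility, which also depends on charge and conformation.

```python
# Average mass of one histidine residue in a peptide chain, in daltons
# (assumption: free histidine 155.15 Da minus water 18.02 Da, rounded).
HIS_RESIDUE_MASS_DA = 137.14


def his_tag_mass_kd(n_his=6):
    """Approximate mass added by a run of n_his histidine residues, in kD."""
    return n_his * HIS_RESIDUE_MASS_DA / 1000.0


if __name__ == "__main__":
    # Six residues add roughly 0.8 kD, consistent with "approximately 1 kD".
    print(round(his_tag_mass_kd(), 2))
```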

The finding that TbH-9 is an intracellular antigen was confirmed by
determining the reactivity of TbH-9-specific human T cell clones to recombinant TbH-9,
secretory M. tuberculosis proteins and PPD. A TbH-9-specific T cell clone (designated
131TbH-9) was generated from PBMC of a healthy PPD-positive donor. The proliferative
response of 131TbH-9 to secretory proteins, recombinant TbH-9 and a control
M. tuberculosis antigen, TbRa11, was determined by measuring uptake of tritiated thymidine, as
described in Example 1. As shown in Figure 3A, the clone 131TbH-9 responds specifically
to TbH-9, showing that TbH-9 is not a significant component of M. tuberculosis secretory
proteins. Figure 3B shows the production of IFN-γ by a second TbH-9-specific T cell clone
(designated PPD 800-10) prepared from PBMC from a healthy PPD-positive donor,
following stimulation of the T cell clone with secretory proteins, PPD or recombinant TbH-9.
These results further confirm that TbH-9 is not secreted by M. tuberculosis.

C. Use of Sera from Patients having Extrapulmonary Tuberculosis to Identify
DNA Sequences Encoding M. tuberculosis Antigens

Genomic DNA was isolated from M. tuberculosis Erdman strain, randomly
sheared and used to construct an expression library employing the Lambda ZAP expression
system (Stratagene, La Jolla, CA). The resulting library was screened using pools of sera
obtained from individuals with extrapulmonary tuberculosis, as described above in Example
3B, with the secondary antibody being goat anti-human IgG + A + M (H+L) conjugated with
alkaline phosphatase.

Eighteen clones were purified. Of these, 4 clones (hereinafter referred to as
XP14, XP24, XP31 and XP32) were found to bear some similarity to known sequences. The
determined DNA sequences for XP14, XP24 and XP31 are provided in SEQ ID NOS: 151-
153, respectively, with the 5' and 3' DNA sequences for XP32 being provided in SEQ ID
NOS: 154 and 155, respectively. The predicted amino acid sequence for XP14 is provided in
SEQ ID NO: 156. The reverse complement of XP14 was found to encode the amino acid
sequence provided in SEQ ID NO: 157.

Comparison of the sequences for the remaining 14 clones (hereinafter referred
to as XP1-XP6, XP17-XP19, XP22, XP25, XP27, XP30 and XP36) with those in the
genebank as described above, revealed no homologies with the exception of the 3' ends of
XP2 and XP6 which were found to bear some homology to known M. tuberculosis cosmids.
The DNA sequences for XP27 and XP36 are shown in SEQ ID NOS: 158 and 159,
respectively, with the 5' sequences for XP4, XP5, XP17 and XP30 being shown in SEQ ID
NOS: 160-163, respectively, and the 5' and 3' sequences for XP2, XP3, XP6, XP18, XP19,
XP22 and XP25 being shown in SEQ ID NOS: 164 and 165; 166 and 167; 168 and 169; 170
and 171; 172 and 173; 174 and 175; and 176 and 177, respectively. XP1 was found to
overlap with the DNA sequences for TbH4, disclosed above. The full-length DNA sequence
for TbH4-XP1 is provided in SEQ ID NO: 178. This DNA sequence was found to contain an
open reading frame encoding the amino acid sequence shown in SEQ ID NO: 179. The
reverse complement of TbH4-XP1 was found to contain an open reading frame encoding the
amino acid sequence shown in SEQ ID NO: 180. The DNA sequence for XP36 was found to
contain two open reading frames encoding the amino acid sequence shown in SEQ ID NOS:
181 and 182, with the reverse complement containing an open reading frame encoding the
amino acid sequence shown in SEQ ID NO: 183.

Recombinant XP1 protein was prepared as described above in Example 3B,
with a metal ion affinity chromatography column being employed for purification.
Recombinant XP1 was found to stimulate cell proliferation and IFN-γ production in T cells
isolated from M. tuberculosis-immune donors.

D. Preparation of M. tuberculosis Soluble Antigens using Rabbit Anti-sera
Raised against M. tuberculosis Fractionated Proteins

M. tuberculosis lysate was prepared as described above in Example 2. The
resulting material was fractionated by HPLC and the fractions screened by Western blot for
serological activity with a serum pool from M. tuberculosis-infected patients which showed
little or no immunoreactivity with other antigens of the present invention. Rabbit anti-sera
was generated against the most reactive fraction using the method described in Example 3A.
The anti-sera was used to screen an M. tuberculosis Erdman strain genomic DNA expression
library prepared as described above. Bacteriophage plaques expressing immunoreactive
antigens were purified. Phagemid from the plaques was rescued and the nucleotide sequences
of the M. tuberculosis clones determined.

Ten different clones were purified. Of these, one was found to be TbRa35,
described above, and one was found to be the previously identified M. tuberculosis antigen,
HSP60. Of the remaining eight clones, six (hereinafter referred to as RDIF2, RDIF5, RDIF8,
RDIF10, RDIF11 and RDIF12) were found to bear some similarity to previously identified
M. tuberculosis sequences. The determined DNA sequences for RDIF2, RDIF5, RDIF8,
RDIF10 and RDIF11 are provided in SEQ ID NOS: 184-188, respectively, with the
corresponding predicted amino acid sequences being provided in SEQ ID NOS: 189-193,
respectively. The 5' and 3' DNA sequences for RDIF12 are provided in SEQ ID NOS: 194
and 195, respectively. No significant homologies were found to the antigen RDIF7. The
determined DNA and predicted amino acid sequences for RDIF7 are provided in SEQ ID
NOS: 196 and 197, respectively. One additional clone, referred to as RDIF6, was isolated;
however, this was found to be identical to RDIF5.

Recombinant RDIF6, RDIF8, RDIF10 and RDIF11 were prepared as
described above. These antigens were found to stimulate cell proliferation and IFN-γ
production in T cells isolated from M. tuberculosis-immune donors.

EXAMPLE 4

Purification and Characterization of a Polypeptide from Tuberculin Purified
Protein Derivative

An M. tuberculosis polypeptide was isolated from tuberculin purified protein
derivative (PPD) as follows.

PPD was prepared as published with some modification (Seibert, F. et al.,
Tuberculin purified protein derivative. Preparation and analyses of a large quantity for
standard. The American Review of Tuberculosis 44:9-25, 1941). M. tuberculosis Rv strain
was grown for 6 weeks in synthetic medium in roller bottles at 37°C. Bottles containing the
bacterial growth were then heated to 100°C in water vapor for 3 hours. Cultures were sterile
filtered using a 0.22 µ filter and the liquid phase was concentrated 20 times using a 3 kD
cut-off membrane. Proteins were precipitated once with 50% ammonium sulfate solution and
eight times with 25% ammonium sulfate solution. The resulting proteins (PPD) were
fractionated by reverse phase liquid chromatography (RP-HPLC) using a C18 column (7.8 x
300 mm; Waters, Milford, MA) in a Biocad HPLC system (Perseptive Biosystems,
Framingham, MA). Fractions were eluted from the column with a linear gradient from 0-
100% buffer (0.1% TFA in acetonitrile). The flow rate was 10 ml/minute and eluent was
monitored at 214 nm and 280 nm.

Six fractions were collected, dried, suspended in PBS and tested individually
in M. tuberculosis-infected guinea pigs for induction of delayed type hypersensitivity (DTH)
reaction. One fraction was found to induce a strong DTH reaction and was subsequently
fractionated further by RP-HPLC on a microbore Vydac C18 column (Cat. No. 218TP5115)
in a Perkin Elmer/Applied Biosystems Division Model 172 HPLC. Fractions were eluted
with a linear gradient from 5-100% buffer (0.05% TFA in acetonitrile) with a flow rate of 80
µl/minute. Eluent was monitored at 215 nm. Eight fractions were collected and tested for
induction of DTH in M. tuberculosis-infected guinea pigs. One fraction was found to induce
strong DTH of about 16 mm induration. The other fractions did not induce detectable DTH.
The positive fraction was submitted to SDS-PAGE gel electrophoresis and found to contain a
single protein band of approximately 12 kD molecular weight.

This polypeptide, hereinafter referred to as DPPD, was sequenced from the
amino terminal using a Perkin Elmer/Applied Biosystems Division Procise 492 protein
sequencer as described above and found to have the N-terminal sequence shown in SEQ ID
NO: 124. Comparison of this sequence with known sequences in the gene bank as described
above revealed no known homologies. Four cyanogen bromide fragments of DPPD were
isolated and found to have the sequences shown in SEQ ID NOS: 125-128.

EXAMPLE 5
Synthesis of Synthetic Polypeptides

Polypeptides may be synthesized on a Millipore 9050 peptide synthesizer
using FMOC chemistry with HPTU (O-Benzotriazole-N,N,N',N'-tetramethyluronium
hexafluorophosphate) activation. A Gly-Cys-Gly sequence may be attached to the amino
terminus of the peptide to provide a method of conjugation or labeling of the peptide.
Cleavage of the peptides from the solid support may be carried out using the following
cleavage mixture: trifluoroacetic acid:ethanedithiol:thioanisole:water:phenol (40:1:2:2:3).
After cleaving for 2 hours, the peptides may be precipitated in cold methyl-t-butyl-ether. The
peptide pellets may then be dissolved in water containing 0.1% trifluoroacetic acid (TFA) and
lyophilized prior to purification by C18 reverse phase HPLC. A gradient of 0-60%
acetonitrile (containing 0.1% TFA) in water (containing 0.1% TFA) may be used to elute the
peptides. Following lyophilization of the pure fractions, the peptides may be characterized
using electrospray mass spectrometry and by amino acid analysis.

This procedure was used to synthesize a TbM-1 peptide that contains one and 
a half repeats of a TbM-1 sequence. The TbM-1 peptide has the sequence 
GCGDRSGGNLDQIRLRRDRSGGNL (SEQ ID NO: 63). 
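
As a quick consistency check on the TbM-1 peptide, the following sketch splits the sequence into the Gly-Cys-Gly linker and the repeat region, and estimates the monoisotopic mass that electrospray mass spectrometry might be expected to report. The 14-residue repeat unit (`DRSGGNLDQIRLRR`) is an assumption inferred from the sequence itself, not stated in the text. The residue masses are standard monoisotopic values, and all function and variable names are illustrative.

```python
# Standard monoisotopic residue masses (Da); peptide mass = sum of residues + one water.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056

TBM1 = "GCGDRSGGNLDQIRLRRDRSGGNL"  # SEQ ID NO: 63, as given in the text
LINKER = "GCG"                     # Gly-Cys-Gly attached at the amino terminus
REPEAT = "DRSGGNLDQIRLRR"          # assumed repeat unit, inferred from the sequence


def monoisotopic_mass(seq: str) -> float:
    """Monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def split_repeats(seq: str, linker: str, repeat: str) -> tuple[int, str]:
    """Return (number of full repeats, trailing partial repeat) after the linker."""
    if not seq.startswith(linker):
        raise ValueError("sequence does not start with the linker")
    body = seq[len(linker):]
    full = 0
    while body.startswith(repeat):
        body = body[len(repeat):]
        full += 1
    if not repeat.startswith(body):
        raise ValueError("trailing residues are not a prefix of the repeat")
    return full, body


full_repeats, partial = split_repeats(TBM1, LINKER, REPEAT)
print(f"length={len(TBM1)} full repeats={full_repeats} partial={partial!r}")
print(f"estimated monoisotopic mass: {monoisotopic_mass(TBM1):.2f} Da")
```

Under this assumed repeat unit, the peptide decomposes into the linker, one full repeat and the first half of a second repeat, which matches the "one and a half repeats" description.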

EXAMPLE 6

Use of Representative Antigens for Serodiagnosis of Tuberculosis

This Example illustrates the diagnostic properties of several representative
antigens.

Assays were performed in 96-well plates coated with 200 ng antigen
diluted to 50 µL in carbonate coating buffer, pH 9.6. The wells were coated overnight at 4°C
(or 2 hours at 37°C). The plate contents were then removed and the wells were blocked for 2
hours with 200 µL of PBS/1% BSA. After the blocking step, the wells were washed five
times with PBS/0.1% Tween 20™. 50 µL sera, diluted 1:100 in PBS/0.1% Tween 20™/0.1%
BSA, was then added to each well and incubated for 30 minutes at room temperature. The
plates were then washed again five times with PBS/0.1% Tween 20™.

The enzyme conjugate (horseradish peroxidase - Protein A, Zymed, San
Francisco, CA) was then diluted 1:10,000 in PBS/0.1% Tween 20™/0.1% BSA, and 50 µL of
the diluted conjugate was added to each well and incubated for 30 minutes at room
temperature. Following incubation, the wells were washed five times with PBS/0.1% Tween
20™. 100 µL of tetramethylbenzidine peroxidase (TMB) substrate (Kirkegaard and Perry
Laboratories, Gaithersburg, MD) was added, undiluted, and incubated for about 15 minutes.
The reaction was stopped with the addition of 100 µL of 1 N H2SO4 to each well, and the
plates were read at 450 nm.

Figure 4 shows the ELISA reactivity of two recombinant antigens isolated
using method A in Example 3 (TbRa3 and TbRa9) with sera from M. tuberculosis positive
and negative patients. The reactivity of these antigens is compared to that of bacterial lysate
isolated from M. tuberculosis strain H37Ra (Difco, Detroit, MI). In both cases, the
recombinant antigens differentiated positive from negative sera. Based on cut-off values
obtained from receiver-operator curves, TbRa3 detected 56 out of 87 positive sera, and
TbRa9 detected 111 out of 165 positive sera.

Figure 5 illustrates the ELISA reactivity of representative antigens isolated
using method B of Example 3. The reactivity of the recombinant antigens TbH4, TbH12,
Tb38-1 and the peptide TbM-1 (as described in Example 4) is compared to that of the 38 kD
antigen described by Andersen and Hansen, Infect. Immun. 57:2481-2488, 1989. Again, all
of the polypeptides tested differentiated positive from negative sera. Based on cut-off values
obtained from receiver-operator curves, TbH4 detected 67 out of 126 positive sera, TbH12
detected 50 out of 125 positive sera, 38-1 detected 61 out of 101 positive sera and the TbM-1
peptide detected 25 out of 30 positive sera.
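
The detection counts reported for Figures 4 and 5 can be expressed as sensitivities (detected positive sera divided by positive sera tested). The sketch below simply restates the counts from the text; the helper name `sensitivity` is illustrative, and no cut-off values are assumed because the text does not give them.

```python
# Detected / tested counts for M. tuberculosis positive sera, as reported in the text.
DETECTED = {
    "TbRa3": (56, 87),
    "TbRa9": (111, 165),
    "TbH4": (67, 126),
    "TbH12": (50, 125),
    "Tb38-1": (61, 101),
    "TbM-1 peptide": (25, 30),
}


def sensitivity(detected: int, tested: int) -> float:
    """Fraction of positive sera that the antigen detected."""
    if tested <= 0 or not 0 <= detected <= tested:
        raise ValueError("require tested > 0 and 0 <= detected <= tested")
    return detected / tested


for antigen, (hits, total) in DETECTED.items():
    print(f"{antigen:14s} {hits:3d}/{total:<3d} {100 * sensitivity(hits, total):5.1f}%")
```

Because each antigen was tested on a different number of sera, the percentages are not directly comparable across panels.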

The reactivity of four antigens (TbRa3, TbRa9, TbH4 and TbH12) with sera
from a group of M. tuberculosis infected patients with differing reactivity in the acid fast stain
of sputum (Smithwick and David, Tubercle 52:226, 1971) was also examined, and compared
to the reactivity of M. tuberculosis lysate and the 38 kD antigen. The results are presented in
Table 3, below:

TABLE 3

Reactivity of Antigens with Sera from M. tuberculosis Patients

                Acid Fast  ELISA Values
Patient         Sputum     Lysate        38kD    TbRa9   TbH12   TbH4    TbRa3

Tb01B93I-2      ++++       [illegible]   0.634   0.998   1.022   1.030   1.314
Tb01B93I-19     ++++       [illegible]   2.322   0.608   0.837   1.857   2.335
Tb01B93I-8      +++        2.703         0.527   0.492   0.281   0.501   2.002
Tb01B93I-10     +++        1.665         1.301   0.685   0.216   0.448   0.458
Tb01B93I-11     +++        2.817         0.697   0.509   0.301   0.173   2.608
Tb01B93I-15     +++        1.28          0.283   0.808   0.218   1.537   0.811
Tb01B93I-16     +++        2.908         >3      0.899   0.441   0.593   1.080
Tb01B93I-25     +++        0.395         0.131   0.335   0.211   0.107   0.948
Tb01B93I-87     +++        2.653         2.432   2.282   0.977   1.221   0.857
Tb01B93I-89     +++        1.912         2.370   2.436   0.876   0.520   0.952
Tb01B94I-108    +++        1.639         0.341   0.797   0.368   0.654   0.798
Tb01B94I-201    +++        1.721         0.419   0.661   0.137   0.064   0.692
Tb01B93I-88     ++         1.939         1.269   2.519   1.381   0.214   0.530
Tb01B93I-92     ++         2.355         2.329   2.78    0.685   0.997   2.527
Tb01B94I-109    ++         0.993         0.620   0.574   0.441   0.5     2.558
Tb01B94I-210    ++         2.777         >3      0.393   0.367   1.004   1.315
Tb01B94I-224    ++         2.913         0.476   0.251   1.297   1.990   0.256
Tb01B93I-9      +          2.649         0.278   0.210   0.140   0.181   1.586
Tb01B93I-14     +          >3            1.538   0.282   0.291   0.549   2.880
Tb01B93I-21     +          2.645         0.739   2.499   0.783   0.536   1.770
Tb01B93I-22     +          0.714         0.451   2.082   0.285   0.269   1.159
Tb01B93I-31     +          0.956         0.490   1.019   0.812   0.176   1.293
Tb01B93I-32     -          2.261         0.786   0.668   0.273   0.535   0.405
Tb01B93I-52     -          [illegible]   0.114   0.434   0.330   0.273   1.140
Tb01B93I-99     -          2.118         0.584   1.62    0.119   0.977   0.729
Tb01B94I-130    -          1.349         0.224   0.86    0.282   0.383   2.146
Tb01B94I-131    -          0.685         0.324   1.173   0.059   0.118   1.431
AT4-0070        Normal     0.072         0.043   0.092   0.071   0.040   0.039
AT4-0105        Normal     0.397         0.121   0.118   0.103   0.078   0.390
3/15/94-1       Normal     0.227         0.064   0.098   0.026   0.001   0.228
4/15/93-2       Normal     0.114         0.240   0.071   0.034   0.041   0.264
5/26/94-4       Normal     0.089         0.259   0.096   0.046   0.008   0.053
5/26/94-3       Normal     0.139         0.093   0.085   0.019   0.067   0.01


Based on cut-off values obtained from receiver-operator curves, TbRa3
detected 23 out of 27 positive sera, TbRa9 detected 22 out of 27, TbH4 detected 18 out of 27
and TbH12 detected 15 out of 27. If used in combination, these four antigens would have a
theoretical sensitivity of 27 out of 27, indicating that these antigens should complement each
other in the serological detection of M. tuberculosis infection. In addition, several of the
recombinant antigens detected positive sera that were not detected using the 38 kD antigen,
indicating that these antigens may be complementary to the 38 kD antigen.
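
The "theoretical sensitivity" of a combination is the size of the union of the sera detected by each antigen. The sketch below illustrates the idea only. The per-antigen counts (23, 22, 18 and 15 of 27) come from the text, but which sera each antigen detects is hypothetical here, because the individual cut-off values are not given. The index sets are chosen purely so that the counts match.

```python
# Serum indices 0..26 stand for the 27 positive sera.
# Membership of each set is HYPOTHETICAL; only the set sizes come from the text.
N_POSITIVE_SERA = 27
DETECTED_BY = {
    "TbRa3": set(range(0, 23)),   # 23 of 27
    "TbRa9": set(range(5, 27)),   # 22 of 27
    "TbH4": set(range(0, 18)),    # 18 of 27
    "TbH12": set(range(12, 27)),  # 15 of 27
}


def combined_detected(detected_by: dict[str, set[int]]) -> set[int]:
    """Sera detected by at least one antigen in the panel."""
    return set().union(*detected_by.values())


union = combined_detected(DETECTED_BY)
missed = set(range(N_POSITIVE_SERA)) - union
print(f"combined: {len(union)} of {N_POSITIVE_SERA}; missed sera: {sorted(missed)}")
```

A combination reaches 27 out of 27 only if every serum missed by one antigen is detected by another, which is the sense in which the antigens complement each other.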
The reactivity of the recombinant antigen TbRa11 with sera from
M. tuberculosis patients shown to be negative for the 38 kD antigen, as well as with sera from
PPD positive and normal donors, was determined by ELISA as described above. The results
are shown in Figure 6, which indicates that TbRa11, while being negative with sera from PPD
positive and normal donors, detected sera that were negative with the 38 kD antigen. Of the
thirteen 38 kD negative sera tested, nine were positive with TbRa11, indicating that this
antigen may be reacting with a sub-group of 38 kD antigen negative sera. In contrast, in a
group of 38 kD positive sera where TbRa11 was reactive, the mean OD450 for TbRa11 was
lower than that for the 38 kD antigen. The data indicate an inverse relationship between the
presence of TbRa11 activity and 38 kD positivity.

The antigen TbRa2A was tested in an indirect ELISA using initially 50 µl of
serum at 1:100 dilution for 30 minutes at room temperature followed by washing in PBS
Tween and incubating for 30 minutes with biotinylated Protein A (Zymed, San Francisco,
CA) at a 1:10,000 dilution. Following washing, 50 µl of streptavidin-horseradish peroxidase
(Zymed) at 1:10,000 dilution was added and the mixture incubated for 30 minutes. After
washing, the assay was developed with TMB substrate as described above. The reactivity of
TbRa2A with sera from M. tuberculosis patients and normal donors is shown in Table 4. The
mean value for reactivity of TbRa2A with sera from M. tuberculosis patients was 0.444 with
a standard deviation of 0.309. The mean for reactivity with sera from normal donors was
0.109 with a standard deviation of 0.029. Testing of 38 kD negative sera (Figure 7) also
indicated that the TbRa2A antigen was capable of detecting sera in this category.

TABLE 4

Reactivity of TbRa2A with Sera from M. tuberculosis Patients and from Normal Donors

Serum ID      Status   OD450

Tb85          TB       0.680
Tb86          TB       0.450
Tb87          TB       0.263
Tb88          TB       0.275
Tb89          TB       0.403
Tb91          TB       0.393
Tb92          TB       0.401
Tb93          TB       0.232
Tb94          TB       0.333
Tb95          TB       0.435
Tb96          TB       0.284
Tb97          TB       0.320
[illegible]   TB       0.328
Tb100         TB       0.817
Tb101         TB       0.607
Tb102         TB       0.191
Tb103         TB       0.228
[illegible]   TB       0.324
[illegible]   TB       1.572
Tb112         TB       0.338
[illegible]   Normal   0.036
[illegible]   Normal   0.126
[illegible]   Normal   0.130
AT4-0052      Normal   0.135
AT4-0053      Normal   0.133
AT4-0062      Normal   0.128
AT4-0070      Normal   0.088
AT4-0091      Normal   0.108
AT4-0100      Normal   0.106
AT4-0105      Normal   0.108
AT4-0109      Normal   0.105
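
The summary statistics quoted for TbRa2A can be checked against the Table 4 entries. The sketch below assumes the sample (n - 1) standard deviation, which the text does not specify. Several values are read from partially legible table entries: 0.393, 0.607, 0.191, 0.228, 0.324, 1.572 and 0.338 for TB sera, and 0.036, 0.126 and 0.130 for normal sera.

```python
from statistics import mean, stdev

# OD450 values from Table 4, in table order.
TB_OD450 = [
    0.680, 0.450, 0.263, 0.275, 0.403, 0.393, 0.401, 0.232, 0.333, 0.435,
    0.284, 0.320, 0.328, 0.817, 0.607, 0.191, 0.228, 0.324, 1.572, 0.338,
]
NORMAL_OD450 = [
    0.036, 0.126, 0.130, 0.135, 0.133, 0.128, 0.088, 0.108, 0.106, 0.108, 0.105,
]

for label, values in (("TB", TB_OD450), ("Normal", NORMAL_OD450)):
    # stdev() is the sample standard deviation (divides by n - 1).
    print(f"{label:6s} n={len(values):2d} mean={mean(values):.3f} sd={stdev(values):.3f}")
```

With these readings the computed means and standard deviations agree with the reported 0.444 and 0.309 for patient sera and 0.109 and 0.029 for normal sera, which supports the readings of the partially legible entries.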



The reactivity of the recombinant antigen (g) (SEQ ID NO: 60) with sera from
M. tuberculosis patients and normal donors was determined by ELISA as described above.
Figure 8 shows the results of the titration of antigen (g) with four M. tuberculosis positive
sera that were all reactive with the 38 kD antigen and with four donor sera. All four positive
sera were reactive with antigen (g).

The reactivity of the recombinant antigen TbH-29 (SEQ ID NO: 137) with
sera from M. tuberculosis patients, PPD positive donors and normal donors was determined
by indirect ELISA as described above. The results are shown in Figure 9. TbH-29 detected
30 out of 60 M. tuberculosis sera, 2 out of 8 PPD positive sera and 2 out of 27 normal sera.

Figure 10 shows the results of ELISA tests (both direct and indirect) of the
antigen TbH-33 (SEQ ID NO: 140) with sera from M. tuberculosis patients and from normal
donors and with a pool of sera from M. tuberculosis patients. The mean OD450 was
demonstrated to be higher with sera from M. tuberculosis patients than from normal donors,
with the mean OD450 being significantly higher in the indirect ELISA than in the direct
ELISA. Figure 11 is a titration curve for the reactivity of recombinant TbH-33 with sera
from M. tuberculosis patients and from normal donors showing an increase in OD450 with
increasing concentration of antigen.

The reactivity of the recombinant antigens RDIF6, RDIF8 and RDIF10 (SEQ
ID NOS: 184-187, respectively) with sera from M. tuberculosis patients and normal donors
was determined by ELISA as described above. RDIF6 detected 6 out of 32 M. tuberculosis
sera and 0 out of 15 normal sera; RDIF8 detected 14 out of 32 M. tuberculosis sera and 0 out
of 15 normal sera; and RDIF10 detected 4 out of 27 M. tuberculosis sera and 1 out of 15
normal sera. In addition, RDIF10 was found to detect 0 out of 5 sera from PPD-positive
donors.
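
Where counts for both patient and normal sera are reported, specificity (the fraction of normal sera correctly scored negative) can be computed alongside sensitivity. The sketch below restates the RDIF counts from the text; the `AntigenResult` class and its field names are illustrative, not part of the described method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AntigenResult:
    name: str
    tb_detected: int      # M. tuberculosis sera scored positive
    tb_tested: int
    normal_detected: int  # normal sera scored positive (false positives)
    normal_tested: int

    @property
    def sensitivity(self) -> float:
        return self.tb_detected / self.tb_tested

    @property
    def specificity(self) -> float:
        return 1 - self.normal_detected / self.normal_tested


RESULTS = [
    AntigenResult("RDIF6", 6, 32, 0, 15),
    AntigenResult("RDIF8", 14, 32, 0, 15),
    AntigenResult("RDIF10", 4, 27, 1, 15),
]

for r in RESULTS:
    print(f"{r.name:7s} sensitivity={r.sensitivity:.1%} specificity={r.specificity:.1%}")
```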

EXAMPLE 7

Preparation and Characterization of M. tuberculosis Fusion Proteins

A fusion protein containing TbRa3, the 38 kD antigen and Tb38-1 was
prepared as follows.

Each of the DNA constructs TbRa3, 38 kD and Tb38-1 was modified by PCR
in order to facilitate their fusion and the subsequent expression of the fusion protein
TbRa3-38 kD-Tb38-1. TbRa3, 38 kD and Tb38-1 DNA was used to perform PCR using the primers
PDM-64 and PDM-65 (SEQ ID NO: 141 and 142), PDM-57 and PDM-58 (SEQ ID NO: 143
and 144), and PDM-69 and PDM-60 (SEQ ID NO: 145-146), respectively. In each case, the
DNA amplification was performed using 10 µl 10X Pfu buffer, 2 µl 10 mM dNTPs, 2 µl each
of the PCR primers at 10 µM concentration, 81.5 µl water, 1.5 µl Pfu DNA polymerase
(Stratagene, La Jolla, CA) and 1 µl DNA at either 70 ng/µl (for TbRa3) or 50 ng/µl (for 38
kD and Tb38-1). For TbRa3, denaturation at 94°C was performed for 2 min, followed by 40
cycles of 96°C for 15 sec and 72°C for 1 min, and lastly by 72°C for 4 min. For 38 kD,
denaturation at 96°C was performed for 2 min, followed by 40 cycles of 96°C for 30 sec,
68°C for 15 sec and 72°C for 3 min, and finally by 72°C for 4 min. For Tb38-1, denaturation
at 94°C for 2 min was followed by 10 cycles of 96°C for 15 sec, 68°C for 15 sec and 72°C for
1.5 min, 30 cycles of 96°C for 15 sec, 64°C for 15 sec and 72°C for 1.5 min, and finally by 72°C
for 4 min.
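
The PCR mix described above can be checked with a short dilution calculation. The sketch below reads the buffer entry as 10 µl of 10X Pfu buffer (the unit is missing in the text), and the "forward"/"reverse" primer labels are illustrative names for "each of the PCR primers". It sums the component volumes and applies C1·V1 = C2·V2 to obtain final concentrations.

```python
# Component volumes in microlitres, as listed in the text.
MIX_UL = {
    "10X Pfu buffer": 10.0,       # assumed unit: the text gives "10" without a unit
    "10 mM dNTPs": 2.0,
    "forward primer, 10 uM": 2.0,  # illustrative label for the first primer
    "reverse primer, 10 uM": 2.0,  # illustrative label for the second primer
    "water": 81.5,
    "Pfu DNA polymerase": 1.5,
    "template DNA": 1.0,
}
TEMPLATE_NG_PER_UL = {"TbRa3": 70.0, "38 kD": 50.0, "Tb38-1": 50.0}


def final_concentration(stock: float, volume_ul: float, total_ul: float) -> float:
    """Dilution C2 = C1 * V1 / V2, in the same unit as the stock concentration."""
    return stock * volume_ul / total_ul


total_ul = sum(MIX_UL.values())
print(f"total reaction volume: {total_ul:.1f} ul")
print(f"buffer:  {final_concentration(10.0, MIX_UL['10X Pfu buffer'], total_ul):.1f}X")
print(f"dNTPs:   {final_concentration(10.0, MIX_UL['10 mM dNTPs'], total_ul):.2f} mM")
print(f"primers: {final_concentration(10.0, MIX_UL['forward primer, 10 uM'], total_ul):.2f} uM each")
for construct, ng_per_ul in TEMPLATE_NG_PER_UL.items():
    print(f"{construct}: {ng_per_ul * MIX_UL['template DNA']:.0f} ng template per reaction")
```

Under that reading the components sum to a 100 µl reaction with 1X buffer, which is consistent with the listed water volume.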

The TbRa3 PCR fragment was digested with NdeI and EcoRI and cloned
directly into pT7 A L2 IL 1 vector using NdeI and EcoRI sites. The 38 kD PCR fragment was
digested with Sse8387I, treated with T4 DNA polymerase to make blunt ends and then
digested with EcoRI for direct cloning into the pT7 A L2Ra3-1 vector which was digested with
StuI and EcoRI. The 38-1 PCR fragment was digested with Eco47III and EcoRI and directly
subcloned into pT7 A L2Ra3/38kD-17 digested with the same enzymes. The whole fusion was
then transferred to pET28b using NdeI and EcoRI sites. The fusion construct was confirmed
by DNA sequencing.

The expression construct was transformed into BLR pLys S E. coli (Novagen,
Madison, WI) and grown overnight in LB broth with kanamycin (30 µg/ml) and
chloramphenicol (34 µg/ml). This culture (12 ml) was used to inoculate 500 ml 2XYT with
the same antibiotics and the culture was induced with IPTG at an OD560 of 0.44 to a final
concentration of 1.2 mM. Four hours post-induction, the bacteria were harvested and
sonicated in 20 mM Tris (8.0), 100 mM NaCl, 0.1% DOC, 20 µg/ml Leupeptin, 20 mM
PMSF followed by centrifugation at 26,000 X g. The resulting pellet was resuspended in 8 M
urea, 20 mM Tris (8.0), 100 mM NaCl and bound to Pro-bond nickel resin (Invitrogen,
Carlsbad, CA). The column was washed several times with the above buffer then eluted with
an imidazole gradient (50 mM, 100 mM, 500 mM imidazole was added to 8 M urea, 20 mM
Tris (8.0), 100 mM NaCl). The eluates containing the protein of interest were then dialyzed
against 10 mM Tris (8.0).

The DNA and amino acid sequences for the resulting fusion protein
(hereinafter referred to as TbRa3-38 kD-Tb38-1) are provided in SEQ ID NO: 147 and 148,
respectively.

A fusion protein containing the two antigens TbH-9 and Tb38-1 (hereinafter
referred to as TbH9-Tb38-1) without a hinge sequence, was prepared using a similar
procedure to that described above. The DNA sequence for the TbH9-Tb38-1 fusion protein is
provided in SEQ ID NO: 151.

A fusion protein containing TbRa3, the antigen 38kD, Tb38-1 and DPEP was
prepared as follows.

Each of the DNA constructs TbRa3, 38 kD and Tb38-1 was modified by PCR
and cloned into vectors essentially as described above, with the primers PDM-69 (SEQ ID
NO: 145) and PDM-83 (SEQ ID NO: 200) being used for amplification of the Tb38-1A
fragment. Tb38-1A differs from Tb38-1 by a DraI site at the 3' end of the coding region that
keeps the final amino acid intact while creating a blunt restriction site that is in frame. The
TbRa3/38kD/Tb38-1A fusion was then transferred to pET28b using NdeI and EcoRI sites.

DPEP DNA was used to perform PCR using the primers PDM-84 and PDM-85
(SEQ ID NO: 201 and 202, respectively) and 1 µl DNA at 50 ng/µl. Denaturation at 94 °C
was performed for 2 min, followed by 10 cycles of 96 °C for 15 sec, 68 °C for 15 sec and
72 °C for 1.5 min; 30 cycles of 96 °C for 15 sec, 64 °C for 15 sec and 72 °C for 1.5 min; and
finally by 72 °C for 4 min. The DPEP PCR fragment was digested with EcoRI and Eco72I
and cloned directly into the pET28Ra3/38kD/38-1A construct which was digested with DraI
and EcoRI. The fusion construct was confirmed to be correct by DNA sequencing.
Recombinant protein was prepared as described above. The DNA and amino acid sequences
for the resulting fusion protein (hereinafter referred to as TbF-2) are provided in SEQ ID NO:
203 and 204, respectively.

EXAMPLE 8
Use of M. tuberculosis Fusion Proteins for
Serodiagnosis of Tuberculosis

The effectiveness of the fusion protein TbRa3-38 kD-Tb38-1, prepared as
described above, in the serodiagnosis of tuberculosis infection was examined by ELISA.

The ELISA protocol was as described above in Example 6, with the fusion
protein being coated at 200 ng/well. A panel of sera was chosen from a group of tuberculosis
patients previously shown, either by ELISA or by western blot analysis, to react with each of
the three antigens individually or in combination. Such a panel enabled the dissection of the
serological reactivity of the fusion protein to determine if all three epitopes functioned with
the fusion protein. As shown in Table 5, all four sera that reacted with TbRa3 only were
detectable with the fusion protein. Three sera that reacted only with Tb38-1 were also
detectable, as were two sera that reacted with 38 kD alone. The remaining 15 sera were all
positive with the fusion protein based on a cut-off in the assay of mean negatives + 3 standard
deviations. These data demonstrate the functional activity of all three epitopes in the fusion
protein.
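
A cut-off of "mean negatives + 3 standard deviations" can be sketched as follows. The negative-control OD450 readings below are hypothetical illustration values, not data from this document. The use of the sample (n - 1) standard deviation is also an assumption, since the text does not say which form was used.

```python
from statistics import mean, stdev


def cutoff(negative_od: list[float], n_sd: float = 3.0) -> float:
    """Cut-off = mean of negative-control readings + n_sd sample standard deviations."""
    return mean(negative_od) + n_sd * stdev(negative_od)


def classify(od: float, threshold: float) -> str:
    """Score a serum as '+' when its OD450 exceeds the cut-off, otherwise '-'."""
    return "+" if od > threshold else "-"


# HYPOTHETICAL negative-control OD450 readings, for illustration only.
negatives = [0.05, 0.08, 0.06, 0.10, 0.07, 0.09]
threshold = cutoff(negatives)

print(f"cut-off: {threshold:.3f}")
for od in (0.848, 0.082):  # two example readings of the kind listed in Table 5
    print(f"OD450 {od:.3f} -> {classify(od, threshold)}")
```

A higher multiple of the standard deviation trades sensitivity for fewer false positives among normal sera; the multiple is exposed here as the `n_sd` parameter.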
289004    TB       -   -   +   0.848   +
308004    TB       -   +       3.338   +
314004    TB       -   +   -   1.362   +
317004    TB       +   -   -   0.763   +
312004    TB       -   -   +   1.079   +
D176      PPD          -   -   0.145   -
D162      PPD      -   -   -   0.073   -
D161      PPD      -   -   -   0.097   -
D27       PPD      -   -   -   0.082   -
A6-124    NORMAL           -   0.053   -
A6-125    NORMAL   -   -   -   0.087   -
A6-126    NORMAL       -   -   0.346   ±
A6-127    NORMAL   -       -   0.064   -
A6-128    NORMAL   -   -   -   0.034   -
A6-129    NORMAL       -   -   0.037   -
A6-130    NORMAL   -   -       0.057   -
A6-131    NORMAL       -       0.054   -
A6-132    NORMAL   -           0.022   -
A6-133    NORMAL       -       0.147   -
A6-134    NORMAL   -   -   -   0.101   -
A6-135    NORMAL   -   -       0.066   -
A6-136    NORMAL   -   -       0.054   -
A6-137    NORMAL               0.065
A6-138    NORMAL               0.041
A6-139    NORMAL               0.103
A6-140    NORMAL               0.212
A6-141    NORMAL               0.056
A6-142    NORMAL               0.051



The reactivity of the fusion protein TbF-2 with sera from
M. tuberculosis-infected patients was examined by ELISA using the protocol described above.
The results of these studies (Table 6) demonstrate that all four antigens function
independently in the fusion protein.
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Table 6 

Reactivity of TbF-2 Fusion Protein with TB and Normal Sera 



oerum iu 




TbF 

1 UI 

OD450 


Status 


TbF-2 
OD450 


Status 


ELISA R< 


^activity 
















J O KU 


ThRa3 


Tb38-1 


DPEP 


DQO 1 _Af\ 


TD 
1 D 


0 ^7 


+ 


0.321 


+ 




4- 




+ 


rJ93 1 -4 1 


TD 

1 Jo 


\J.\J\J i 


4 


0 396 

VS. -J 7U 


+ 


4 


+ 


+ 




noi i i no 


TD. 
1 D 


0 404 


4 


0.404 


4- 


4. 




± 




B931-13Z 


1 13 


1 .JUZ 


_i_ 


1 909 




T 


1 


+ 


± 


5004 


'I'D 


1 .OUD 


4 


1 666 


+ 


4. 


4. 






15004 


TB 


9 Rfi9 
Z.oOZ 


4 


9 4fiR 

Z.HUO 


+ 


+ 


T 


+ 




39004 


1 ts 


9 443 


4 


1 799 


+ 


+ 


J_ 
T 


+ 




68004 


TD 

1 B 


Z.O / I 


-4- 
T 


9 S7S i 

Z.J / J 


4. 


+ 


1 

T 


+ 




99004 




n i;oi 

u.ov i 


_i_ 
T 


n Q7i 1 
\j.y i 1 






± 


+ 




107004 


TB 


U.o / j 


t" 


n 739 

U. / jZ 


4. 




± 


4- 




92004 


TB 


1 .03/ 


1. 


1 104 


4 


+ 


X 


x 




97004 


TB 


1 ,4y l 




1 Q7Q 

1 .y iy 


4. 




1 

x 




+ 


1 1 8004 


TD 

TB 


3. 1 oz 


T 


3 f!4^ 


+ 


+ 


_J_ 
X 






173004 


TB 


3.044 


T 


J.D /o 


4. 


+ 


_1_ 
T 






175004 


TB 


i ion 
3.332 


l 
T 


9 Qlfi 
Z.7 1 0 


4 


+ 


•4. 
T 






274004 


TD 


J.OVO 


_|_ 
T 


J. / 10 


4. 




T 




+ 


n^nn>i 

276004 


TD 


1 94*5 


+ 


9 Sfi 

X.JO 


4 






+ 




«-» oi r\f\ a 

282004 


TD 
1 D 


1 94Q 
I .Z4V 


+ 


1 934 

1 .ZJH 


4- 


-4- 








289004 


1 15 


1 373 

1 mJ / J 


4. 


1 17 

1.1/ 


4- 




4- 






308004 


TD 

1 D 


3 7HR 




3.355 


+ 










3 14UU4 


TD 
I £3 


1 663 


+ 


1.399 


+ 






+ 




J 1 /UU4 


TD 
I D 


1.1 UJ 




0.92 


+ 


+ 




_ 


_ 


J 1 ZUU*f 


TD 
1 D 


1.709 




1.453 


+ 




+ 


_ 


_ 


oftnnn/i 

J SUUUh 


TR 


0.238 




0.461 


+ 




± 


_ 


+ 


4S 10,04 


TB 


0.18 




0.2 










± 


H / OUUt 


TB 


0.188 




0.469 


+ 


_ 


_ 




± 


410004 


TB 


0.384 


T 


2.392 


+ 


± 




_ 


+ 


411004 


TB 


0.306 


T 


0.874 


+ 






- 


+ 


421004 


TB 


0.357 


+ 


1.456 


+ 








T 1 


528004 


TB 


0.047 




0.196 










+ ! 


A6-87 


Normal 


0.094 




0.063 












A6-88 


Normal 


0.214 




0.19 












A6-89 


Normal 


0.248 




0.125 












A6-90 


Normal 


0.179 




0.206 












A6-91 


Normal 


0.135 




0.151 












A6-92 


Normal 


0.064 




0.097 












A6-93 


Normal 


0.072 




0.098 












A6-94 


Normal 


0.072 




0.064 












A6-95 


Normal 


0.125 




0.159 












A6-96 


Normal 


0.121 




0.12 
































Cut-off 




0.284 




0.266 
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One of skill in the art will appreciate that the order of the individual antigens 
within the fusion protein may be changed and that comparable activity would be expected 
provided each of the epitopes is still functionally available. In addition, truncated forms of 
the proteins containing active epitopes may be used in the construction of fusion proteins. 

From the foregoing, it will be appreciated that, although specific embodiments 
of the invention have been described herein for the purpose of illustration, various 
modifications may be made without deviating from the spirit and scope of the invention. 
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SEQUENCE LISTING 



(1) GENERAL INFORMATION: 

(i) APPLICANTS: Reed, Steven G. 

Skeiky, Yasir A.W. 
Dillon, Davin C. 
Campos-Neto, Antonia 
Houghton, Raymond 
Vedvick, Thomas S. 
Twardzik, Daniel R. 
Lodes, Michael J. 

(ii) TITLE OF INVENTION: COMPOUNDS AND METHODS FOR DIAGNOSIS OF 

TUBERCULOSIS 

(iii) NUMBER OF SEQUENCES: 209 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: SEED and BERRY LLP 

(B) STREET: 6300 Columbia Center, 701 Fifth Avenue 

(C) CITY: Seattle 

(D) STATE: Washington 

(E) COUNTRY: USA 

(F) ZIP: 98104-7092 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS /MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 

(B) FILING DATE: 01-OCT-1997 

(C) CLASSIFICATION:

(viii) ATTORNEY /AGENT INFORMATION: 

(A) NAME: Maki, David J. 

(B) REGISTRATION NUMBER: 31,392 

(C) REFERENCE / DOCKET NUMBER: 210121.417C7

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (206) 622-4900 

(B) TELEFAX: (206) 682-6031 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 766 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

CGAGGCACCG GTAGTTTGAA CCAAACGCAC AATCGACGGG CAAACGAACG GAAGAACACA 60 

ACCATGAAGA TGGTGAAATC GATCGCCGCA GGTCTGACCG CCGCGGCTGC AATCGGCGCC 120 

GCTGCGGCCG GTGTGACTTC GATCATGGCT GGCGGCCCGG TCGTATACCA GATGCAGCCG 180 

GTCGTCTTCG GCGCGCCACT GCCGTTGGAC CCGGCATCCG CCCCTGACGT CCCGACCGCC 240

GCCCAGTTGA CCAGCCTGCT CAACAGCCTC GCCGATCCCA ACGTGTCGTT TGCGAACAAG 300 

GGCAGTCTGG TCGAGGGCGG CATCGGGGGC ACCGAGGCGC GCATCGCCGA CCACAAGCTG 360 

AAGAAGGCCG CCGAGCACGG GGATCTGCCG CTGTCGTTCA GCGTGACGAA CATCCAGCCG 420

GCGGCCGCCG GTTCGGCCAC CGCCGACGTT TCCGTCTCGG GTCCGAAGCT CTCGTCGCCG 480

GTCACGCAGA ACGTCACGTT CGTGAATCAA GGCGGCTGGA TGCTGTCACG CGCATCGGCG 540

ATGGAGTTGC TGCAGGCCGC AGGGNAACTG ATTGGCGGGC CGGNTTCAGC CCGCTGTTCA 600 

GCTACGCCGC CCGCCTGGTG ACGCGTCCAT GTCGAACACT CGCGCGTGTA GCACGGTGCG 660 

GTNTGCGCAG GGNCGCACGC ACCGCCCGGT GCAAGCCGTC CTCGAGATAG GTGGTGNCTC 720 

GNCACCAGNG ANCACCCCCN NNTCGNCNNT TCTCGNTGNT GNATGA 766

(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 752 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

ATGCATCACC ATCACCATCA CGATGAAGTC ACGGTAGAGA CGACCTCCGT CTTCCGCGCA 60 

GACTTCCTCA GCGAGCTGGA CGCTCCTGCG CAAGCGGGTA CGGAGAGCGC GGTCTCCGGG 120 

GTGGAAGGGC TCCCGCCGGG CTCGGCGTTG CTGGTAGTCA AACGAGGCCC CAACGCCGGG 180 

TCCCGGTTCC TACTCGACCA AGCCATCACG TCGGCTGGTC GGCATCCCGA CAGCGACATA 240 
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TTTCTCGACG ACGTGACCGT GAGCCGTCGC CATGCTGAAT TCCGGTTGGA AAACAACGAA 300 

TTCAATGTCG TCGATGTCGG GAGTCTCAAC GGCACCTACG TCAACCGCGA GCCCGTGGAT 360 

TCGGCGGTGC TGGCGAACGG CGACGAGGTC CAGATCGGCA AGCTCCGGTT GGTGTTCTTG 420 

ACCGGACCCA AGCAAGGCGA GGATGACGGG AGTACCGGGG GCCCGTGAGC GCACCCGATA 480

GCCCCGCGCT GGCCGGGATG TCGATCGGGG CGGTCCTCCG ACCTGCTACG ACCGGATTTT 540

CCCTGATGTC CACCATCTCC AAGATTCGAT TCTTGGGAGG CTTGAGGGTC NGGGTGACCC 600 

CCCCGCGGGC CTCATTCNGG GGTNTCGGCN GGTTTCACCC CNTACCNACT GCCNCCCGGN 660 

TTGCNAATTC NTTCTTCNCT GCCCNNAAAG GGACCNTTAN CTTGCCGCTN GAAANGGTNA 720

TCCNGGGCCC NTCCTNGAAN CCCCNTCCCC CT 752

(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 813 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

CATATGCATC ACCATCACCA TCACACTTCT AACCGCCCAG CGCGTCGGGG GCGTCGAGCA 60 

CCACGCGACA CCGGGCCCGA TCGATCTGCT AGCTTGAGTC TGGTCAGGCA TCGTCGTCAG 120 

CAGCGCGATG CCCTATGTTT GTCGTCGACT CAGATATCGC GGCAATCCAA TCTCCCGCCT 180 

GCGGCCGGCG GTGCTGCAAA CTACTCCCGG AGGAATTTCG ACGTGCGCAT CAAGATCTTC 240

ATGCTGGTCA CGGCTGTCGT TTTGCTCTGT TGTTCGGGTG TGGCCACGGC CGCGCCCAAG 300 

ACCTACTGCG AGGAGTTGAA AGGCACCGAT ACCGGCCAGG CGTGCCAGAT TCAAATGTCC 360 

GACCCGGCCT ACAACATCAA CATCAGCCTG CCCAGTTACT ACCCCGACCA GAAGTCGCTG 420 

GAAAATTACA TCGCCCAGAC GCGCGACAAG TTCCTCAGCG CGGCCACATC GTCCACTCCA 480

CGCGAAGCCC CCTACGAATT GAATATCACC TCGGCCACAT ACCAGTCCGC GATACCGCCG 540

CGTGGTACGC AGGCCGTGGT GCTCAMGGTC TACCACAACG CCGGCGGCAC GCACCCAACG 600 

ACCACGTACA AGGCCTTCGA TTGGGACCAG GCCTATCGCA AGCCAATCAC CTATGACACG 660 

CTGTGGCAGG CTGACACCGA TCCGCTGCCA GTCGTCTTCC CCATTGTTGC AAGGTGAACT 720

GAGCAACGCA GACCGGGACA ACWGGTATCG ATAGCCGCCN AATGCCGGCT TGGAACCCNG 780

TGAAATTATC ACAACTTCGC AGTCACNAAA NAA 813
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 447 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

CGGTATGAAC ACGGCCGCGT CCGATAACTT CCAGCTGTCC CAGGGTGGGC AGGGATTCGC 60

CATTCCGATC GGGCAGGCGA TGGCGATCGC GGGCCAGATC CGATCGGGTG GGGGGTCACC 120

CACCGTTCAT ATCGGGCCTA CCGCCTTCCT CGGCTTGGGT GTTGTCGACA ACAACGGCAA 180

CGGCGCACGA GTCCAACGCG TGGTCGGGAG CGCTCCGGCG GCAAGTCTCG GCATCTCCAC 240

CGGCGACGTG ATCACCGCGG TCGACGGCGC TCCGATCAAC TCGGCCACCG CGATGGCGGA 300

CGCGCTTAAC GGGCATCATC CCGGTGACGT CATCTCGGTG AACTGGCAAA CCAAGTCGGG 360

CGGCACGCGT ACAGGGAACG TGACATTGGC CGAGGGACCC CCGGCCTGAT TTCGTCGYGG 420

ATACCACCCG CCGGCCGGCC AATTGGA 447



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 604 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

GTCCCACTGC GGTCGCCGAG TATGTCGCCC AGCAAATGTC TGGCAGCCGC CCAACGGAAT 60

CCGGTGATCC GACGTCGCAG GTTGTCGAAC CCGCCGCCGC GGAAGTATCG GTCCATGCCT 120

AGCCCGGCGA CGGCGAGCGC CGGAATGGCG CGAGTGAGGA GGCGGGCAAT TTGGCGGGGC 180

CCGGCGACGG NGAGCGCCGG AATGGCGCGA GTGAGGAGGT GGNCAGTCAT GCCCAGNGTG 240

ATCCAATCAA CCTGNATTCG GNCTGNGGGN CCATTTGACA ATCGAGGTAG TGAGCGCAAA 300

TGAATGATGG AAAACGGGNG GNGACGTCCG NTGTTCTGGT GGTGNTAGGT GNCTGNCTGG 360

NGTNGNGGNT ATCAGGATGT TCTTCGNCGA AANCTGATGN CGAGGAACAG GGTGTNCCCG 420

NNANNCCNAN GGNGTCCNAN CCCNNNNTCC TCGNCGANAT CANANAGNCG NTTGATGNGA 480

NAAAAGGGTG GANCAGNNNN AANTNGNGGN CCNAANAANC NNNANNGNNG NNAGNTNGNT 540

NNNTNTTNNC ANNNNNNNTG NNGNNGNNCN NNNCAANCNN NTNNNNGNAA NNGGNTTNTT 600

NAAT 604



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 633 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

TTGCANGTCG AACCACCTCA CTAAAGGGAA CAAAAGCTNG AGCTCCACCG CGGTGGCGGC 60

CGCTCTAGAA CTAGTGKATM YYYCKGGCTG CAGSAATYCG GYACGAGCAT TAGGACAGTC 120

TAACGGTCCT GTTACGGTGA TCGAATGACC GACGACATCC TGCTGATCGA CACCGACGAA 180

CGGGTGCGAA CCCTCACCCT CAACCGGCCG CAGTCCCGYA ACGCGCTCTC GGCGGCGCTA 240

CGGGATCGGT TTTTCGCGGY GTTGGYCGAC GCCGAGGYCG ACGACGACAT CGACGTCGTC 300

ATCCTCACCG GYGCCGATCC GGTGTTCTGC GCCGGACTGG ACCTCAAGGT AGCTGGCCGG 360

GCAGACCGCG CTGCCGGACA TCTCACCGCG GTGGGCGGCC ATGACCAAGC CGGTGATCGG 420

CGCGATCAAC GGCGCCGCGG TCACCGGCGG GCTCGAACTG GCGCTGTACT GCGACATCCT 480

GATCGCCTCC GAGCACGCCC GCTTCGNCGA CACCCACGCC CGGGTGGGGC TGCTGCCCAC 540

CTGGGGACTC AGTGTGTGCT TGCCGCAAAA GGTCGGCATC GGNCTGGGCC GGTGGATGAG 600

CCTGACCGGC GACTACCTGT CCGTGACCGA CGC 633

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

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(A) LENGTH: 1362 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

CGACGACGAC GGCGCCGGAG AGCGGGCGCG AACGGCGATC GACGCGGCCC TGGCCAGAGT 60 

CGGCACCACC CAGGAGGGAG TCGAATCATG AAATTTGTCA ACCATATTGA GCCCGTCGCG 120 

CCCCGCCGAG CCGGCGGCGC GGTCGCCGAG GTCTATGCCG AGGCCCGCCG CGAGTTCGGC 180 

CGGCTGCCCG AGCCGCTCGC CATGCTGTCC CCGGACGAGG GACTGCTCAC CGCCGGCTGG 240

GCGACGTTGC GCGAGACACT GCTGGTGGGC CAGGTGCCGC GTGGCCGCAA GGAAGCCGTC 300 

GCCGCCGCCG TCGCGGCCAG CCTGCGCTGC CCCTGGTGCG TCGACGCACA CACCACCATG 360 

CTGTACGCGG CAGGCCAAAC CGACACCGCC GCGGCGATCT TGGCCGGCAC AGCACCTGCC 420

GCCGGTGACC CGAACGCGCC GTATGTGGCG TGGGCGGCAG GAACCGGGAC ACCGGCGGGA 480

CCGCCGGCAC CGTTCGGCCC GGATGTCGCC GCCGAATACC TGGGCACCGC GGTGCAATTC 540

CACTTCATCG CACGCCTGGT CCTGGTGCTG CTGGACGAAA CCTTCCTGCC GGGGGGCCCG 600

CGCGCCCAAC AGCTCATGCG CCGCGCCGGT GGACTGGTGT TCGCCCGCAA GGTGCGCGCG 660 

GAGCATCGGC CGGGCCGCTC CACCCGCCGG CTCGAGCCGC GAACGCTGCC CGACGATCTG 720 

GCATGGGCAA CACCGTCCGA GCCCATAGCA ACCGCGTTCG CCGCGCTCAG CCACCACCTG 780

GACACCGCGC CGCACCTGCC GCCACCGACT CGTCAGGTGG TCAGGCGGGT CGTGGGGTCG 840

TGGCACGGCG AGCCAATGCC GATGAGCAGT CGCTGGACGA ACGAGCACAC CGCCGAGCTG 900

CCCGCCGACC TGCACGCGCC CACCCGTCTT GCCCTGCTGA CCGGCCTGGC CCCGCATCAG 960 

GTGACCGACG ACGACGTCGC CGCGGCCCGA TCCCTGCTCG ACACCGATGC GGCGCTGGTT 1020

GGCGCCCTGG CCTGGGCCGC CTTCACCGCC GCGCGGCGCA TCGGCACCTG GATCGGCGCC 1080 

GCCGCCGAGG GCCAGGTGTC GCGGCAAAAC CCGACTGGGT GAGTGTGCGC GCCCTGTCGG 1140

TAGGGTGTCA TCGCTGGCCC GAGGGATCTC GCGGCGGCGA ACGGAGGTGG CGACACAGGT 1200 

GGAAGCTGCG CCCACTGGCT TGCGCCCCAA CGCCGTCGTG GGCGTTCGGT TGGCCGCACT 1260 

GGCCGATCAG GTCGGCGCCG GCCCTTGGCC GAAGGTCCAG CTCAACGTGC CGTCACCGAA 1320

GGACCGGACG GTCACCGGGG GTCACCCTGC GCGCCCAAGG AA 1362
(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1458 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

GCGACGACCC CGATATGCCG GGCACCGTAG CGAAAGCCGT CGCCGACGCA CTCGGGCGCG 60 

GTATCGCTCC CGTTGAGGAC ATTCAGGACT GCGTGGAGGC CCGGCTGGGG GAAGCC^GTC 120 

TGGATGACGT GGCCCGTGTT TACATCATCT ACCGGCAGCG GCGCGCCGAG CTGCGGACGG 180

CTAAGGCCTT GCTCGGCGTG CGGGACGAGT TAAAGCTGAG CTTGGCGGCC GTGACGGTAC 240

TGCGCGAGCG CTATCTGCTG CACGACGAGC AGGGCCGGCC GGCCGAGTCG ACCGGCGAGC 300 

TGATGGACCG ATCGGCGCGC TGTGTCGCGG CGGCCGAGGA CCAGTATGAG CCGGGCTCGT 360 

CGAGGCGGTG GGCCGAGCGG TTCGCCACGC TATTACGCAA CCTGGAATTC CTGCCGAATT 420

CGCCCACGTT GATGAACTCT GGCACCGACC TGGGACTGCT CGCCGGCTGT TTTGTTCTGC 480

CGATTGAGGA TTCGCTGCAA TCGATCTTTG CGACGCTGGG ACAGGCCGCC GAGCTGCAGC 540

GGGCTGGAGG CGGCACCGGA TATGCGTTCA GCCACCTGCG ACCCGCCGGG GATCGGGTGG 600 

CCTCCACGGG CGGCACGGCC AGCGGACCGG TGTCGTTTCT ACGGCTGTAT GACAGTGCCG 660 

CGGGTGTGGT CTCCATGGGC GGTCGCCGGC GTGGCGCCTG TATGGCTGTG CTTGATGTGT 720

CGCACCCGGA TATCTGTGAT TTCGTCACCG CCAAGGCCGA ATCCCCCAGC GAGCTCCCGC 780

ATTTCAACCT ATCGGTTGGT GTGACCGACG CGTTCCTGCG GGCCGTCGAA CGCAACGGCC 840

TACACCGGCT GGTCAATCCG CGAACCGGCA AGATCGTCGC GCGGATGCCC GCCGCCGAGC 900 

TGTTCGACGC CATCTGCAAA GCCGCGCACG CCGGTGGCGA TCCCGGGCTG GTGTTTCTCG 960 

ACACGATCAA TAGGGCAAAC CCGGTGCCGG GGAGAGGCCG CATCGAGGCG ACCAACCCGT 1020

GCGGGGAGGT CCCACTGCTG CCTTACGAGT CATGTAATCT CGGCTCGATC AACCTCGCCC 1080

GGATGCTCGC CGACGGTCGC GTCGACTGGG ACCGGCTCGA GGAGGTCGCC GGTGTGGCGG 1140

TGCGGTTCCT TGATGACGTC ATCGATGTCA GCCGCTACCC CTTCCCCGAA CTGGGTGAGG 1200

CGGCCCGCGC CACCCGCAAG ATCGGGCTGG GAGTCATGGG TTTGGCGGAA CTGCTTGCCG 1260

CACTGGGTAT TCCGTACGAC AGTGAAGAAG CCGTGCGGTT AGCCACCCGG CTCATGCGTC 1320 

GCATACAGCA GGCGGCGCAC ACGGCATCGC GGAGGCTGGC CGAAGAGCGG GGCGCATTCC 1380 

CGGCGTTCAC CGATAGCCGG TTCGCGCGGT CGGGCCCGAG GCGCAACGCA CAGGTCACCT 1440

CCGTCGCTCC GACGGGCA 1458 
(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 862 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 
ACGGTGTAAT CGTGCTGGAT CTGGAACCGC GTGGCCCGCT ACCTACCGAG ATCTACTGGC 60 

GGCGCAGGGG GCTGGCCCTG GGCATCGCGG TCGTCGTAGT CGGGATCGCG GTGGCCATCG 120 

TCATCGCCTT CGTCGACAGC AGCGCCGGTG CCAAACCGGT CAGCGCCGAC AAGCCGGCCT 180 

CCGCCCAGAG CCATCCGGGC TCGCCGGCAC CCCAAGCACC CCAGCCGGCC GGGCAAACCG 240

AAGGTAACGC CGCCGCGGCC CCGCCGCAGG GCCAAAACCC CGAGACACCC ACGCCCACCG 300 

CCGCGGTGCA GCCGCCGCCG GTGCTCAAGG AAGGGGACGA TTGCCCCGAT TCGACGCTGG 360

CCGTCAAAGG TTTGACCAAC GCGCCGCAGT ACTACGTCGG CGACCAGCCG AAGTTCACCA 420

TGGTGGTCAC CAACATCGGC CTGGTGTCCT GTAAACGCGA CGTTGGGGCC GCGGTGTTGG 480

CCGCCTACGT TTACTCGCTG GACAACAAGC GGTTGTGGTC CAACCTGGAC TGCGCGCCCT 540

CGAATGAGAC GCTGGTCAAG ACGTTTTCCC CCGGTGAGCA GGTAACGACC GCGGTGACCT 600 

GGACCGGGAT GGGATCGGCG CCGCGCTGCC CATTGCCGCG GCCGGCGATC GGGCCGGGCA 660 

CCTACAATCT CGTGGTACAA CTGGGCAATC TGCGCTCGCT GCCGGTTCCG TTCATCCTGA 720 

ATCAGCCGCC GCCGCCGCCC GGGCCGGTAC CCGCTCCGGG TCCAGCGCAG GCGCCTCCGC 780 

CGGAGTCTCC CGCGCAAGGC GGATAATTAT TGATCGCTGA TGGTCGATTC CGCCAGCTGT 840

GACAACCCCT CGCCTCGTGC CG 862
(2) INFORMATION FOR SEQ ID NO: 10:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 622 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 



TTGATCAGCA CCGGCAAGGC GTCACATGCC TCCCTGGGTG TGCAGGTGAC CAATGACAAA 60
GACACCCCGG GCGCCAAGAT CGTCGAAGTA GTGGCCGGTG GTGCTGCCGC GAACGCTGGA 120
GTGCCGAAGG GCGTCGTTGT CACCAAGGTC GACGACCGCC CGATCAACAG CGCGGACGCG 180
TTGGTTGCCG CCGTGCGGTC CAAAGCGCCG GGCGCCACGG TGGCGCTAAC CTTTCAGGAT 240
CCCTCGGGCG GTAGCCGCAC AGTGCAAGTC ACCCTCGGCA AGGCGGAGCA GTGATGAAGG 300
TCGCCGCGCA GTGTTCAAAG CTCGGATATA CGGTGGCACC CATGGAACAG CGTGCGGAGT 360
TGGTGGTTGG CCGGGCACTT GTCGTCGTCG TTGACGATCG CACGGCGCAC GGCGATGAAG 420
ACCACAGCGG GCCGCTTGTC ACCGAGCTGC TCACCGAGGC CGGGTTTGTT GTCGACGGCG 480
TGGTGGCGGT GTCGGCCGAC GAGGTCGAGA TCCGAAATGC GCTGAACACA GCGGTGATCG 540
GCGGGGTGGA CCTGGTGGTG TCGGTCGGCG GGACCGGNGT GACGNCTCGC GATGTCACCC 600
CGGAAGCCAC CCGNGACATT CT 622



(2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1200 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
GGCGCAGCGG TAAGCCTGTT GGCCGCCGGC ACACTGGTGT TGACAGCATG CGGCGGTGGC 60
ACCAACAGCT CGTCGTCAGG CGCAGGCGGA ACGTCTGGGT CGGTGCACTG CGGCGGCAAG 120
AAGGAGCTCC ACTCCAGCGG CTCGACCGCA CAAGAAAATG CCATGGAGCA GTTCGTCTAT 180
GCCTACGTGC GATCGTGCCC GGGCTACACG TTGGACTACA ACGCCAACGG GTCCGGTGCC 240
GGGGTGACCC AGTTTCTCAA CAACGAAACC GATTTCGCCG GCTCGGATGT CCCGTTGAAT 300
CCGTCGACCG GTCAACCTGA CCGGTCGGCG GAGCGGTGCG GTTCCCCGGC ATGGGACCTG 360
CCGACGGTGT TCGGCCCGAT CGCGATCACC TACAATATCA AGGGCGTGAG CACGCTGAAT 420
CTTGACGGAC CCACTACCGC CAAGATTTTC AACGGCACCA TCACCGTGTG GAATGATCCA 480
CAGATCCAAG CCCTCAACTC CGGCACCGAC CTGCCGCCAA CACCGATTAG CGTTATCTTC 540
CGCAGCGACA AGTCCGGTAC GTCGGACAAC TTCCAGAAAT ACCTCGACGG TGTATCCAAC 600
GGGGCGTGGG GCAAAGGCGC CAGCGAAACG TTCAGCGGGG GCGTCGGCGT CGGCGCCAGC 660
GGGAACAACG GAACGTCGGC CCTACTGCAG ACGACCGACG GGTCGATCAC CTACAACGAG 720
TGGTCGTTTG CGGTGGGTAA GCAGTTGAAC ATGGCCCAGA TCATCACGTC GGCGGGTCCG 780
GATCCAGTGG CGATCACCAC CGAGTCGGTC GGTAAGACAA TCGCCGGGGC CAAGATCATG 840
GGACAAGGCA ACGACCTGGT ATTGGACACG TCGTCGTTCT ACAGACCCAC CCAGCCTGGC 900
TCTTACCCGA TCGTGCTGGC GACCTATGAG ATCGTCTGCT CGAAATACCC GGATGCGACG 960
ACCGGTACTG CGGTAAGGGC GTTTATGCAA GCCGCGATTG GTCCAGGCCA AGAAGGCCTG 1020
GACCAATACG GCTCCATTCC GTTGCCCAAA TCGTTCCAAG CAAAATTGGC GGCCGCGGTG 1080
AATGCTATTT CTTGACCTAG TGAAGGGAAT TCGACGGTGA GCGATGCCGT TCCGCAGGTA 1140
GGGTCGCAAT TTGGGCCGTA TCAGCTATTG CGGCTGCTGG GCCGAGGCGG GATGGGCGAG 1200



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1155 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 
GCAAGCAGCT GCAGGTCGTG CTGTTCGACG AACTGGGCAT GCCGAAGACC AAACGCACCA 60
AGACCGGCTA CACCACGGAT GCCGACGCGC TGCAGTCGTT GTTCGACAAG ACCGGGCATC 120
CGTTTCTGCA ACATCTGCTC GCCCACCGCG ACGTCACCCG GCTCAAGGTC ACCGTCGACG 180
GGTTGCTCCA AGCGGTGGCC GCCGACGGCC GCATCCACAC CACGTTCAAC CAGACGATCG 240
CCGCGACCGG CCGGCTCTCC TCGACCGAAC CCAACCTGCA GAACATCCCG ATCCGCACCG 300
ACGCGGGCCG GCGGATCCGG GACGCGTTCG TGGTCGGGGA CGGTTACGCC GAGTTGATGA 360
CGGCCGACTA CAGCCAGATC GAGATGCGGA TCATGGGGCA CCTGTCCGGG GACGAGGGCC 420
TCATCGAGGC GTTCAACACC GGGGAGGACC TGTATTCGTT CGTCGCGTCC CGGGTGTTCG 480
GTGTGCCCAT CGACGAGGTC ACCGGCGAGT TGCGGCGCCG GGTCAAGGCG ATGTCCTACG 540
GGCTGGTTTA CGGGTTGAGC GCCTACGGCC TGTCGCAGCA GTTGAAAATC TCCACCGAGG 600
AAGCCAACGA GCAGATGGAC GCGTATTTCG CCCGATTCGG CGGGGTGCGC GACTACCTGC 660
GCGCCGTAGT CGAGCGGGCC CGCAAGGACG GCTACACCTC GACGGTGCTG GGCCGTCGCC 720
GCTACCTGCC CGAGCTGGAC AGCAGCAACC GTCAAGTGCG GGAGGCCGCC gagcgggc4g 780
CGCTGAACGC GCCGATCCAG GGCAGCGCGG CCGACATCAT CAAGGTGGCC ATGATCCAGG 840
TCGACAAGGC GCTCAACGAG GCACAGCTGG CGTCGCGCAT GCTGCTGCAG GTCCACGACG 900
AGCTGCTGTT CGAAATCGCC CCCGGTGAAC GCGAGCGGGT CGAGGCCCTG GTGCGCGACA 960
AGATGGGCGG CGCTTACCCG CTCGACGTCC CGCTGGAGGT GTCGGTGGGC TACGGCCGCA 1020
GCTGGGACGC GGCGGCGCAC TGAGTGCCGA GCGTGCATCT GGGGCGGGAA TTCGGCGATT 1080
TTTCCGCCCT GAGTTCACGC TCGGCGCAAT CGGGACCGAG TTTGTCCAGC GTGTACCCGT 1140
CGAGTAGCCT CGTCA 1155



(2) INFORMATION FOR SEQ ID NO: 13: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1771 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13: 
GAGCGCCGTC TGGTGTTTGA ACGGTTTTAC CGGTCGGCAT CGGCACGGGC GTTGCCGGGT 60
TCGGGCCTCG GGTTGGCGAT CGTCAAACAG GTGGTGCTCA ACCACGGCGG ATTGCTGCGC 120
ATCGAAGACA CCGACCCAGG CGGCCAGCCC CCTGGAACGT CGATTTACGT GCTGCTCCCC 180
GGCCGTCGGA TGCCGATTCC GCAGCTTCCC GGTGCGACGG CTGGCGCTCG GAGCACGGAC 240
ATCGAGAACT CTCGGGGTTC GGCGAACGTT ATCTCAGTGG AATCTCAGTC CACGCGCGCA 300
ACCTAGTTGT GCAGTTACTG TTGAAAGCCA CACCCATGCC AGTCCACGCA TGGCCAAGTT 360
GGCCCGAGTA GTGGGCCTAG TACAGGAAGA GCAACCTAGC GACATGACGA ATCACCCACG 420
GTATTCGCCA CCGCCGCAGC AGCCGGGAAC CCCAGGTTAT GCTCAGGGGC AGCAGCAAAC 480
GTACAGCCAG CAGTTCGACT GGCGTTACCC ACCGTCCCCG CCCCCGCAGC CAACCCAGTA 540
CCGTCAACCC TACGAGGCGT TGGGTGGTAC CCGGCCGGGT CTGATACCTG GCGTGATTCC 600
GAGCATGACG CCCCCTCCTG GGATGGTTCG CCAACGCCCT CGTGCAGGCA TGTTGGCCAT 660
CGGCGCGGTG ACGATAGCGG TGGTGTCCGC CGGCATCGGC GGCGCGGCCG CATCCCTGGT 720
CGGGTTCAAC CGGGCACCCG CCGGCCCCAG CGGCGGCCCA GTGGCTGCCA GCGCGGCGCC 780
AAGCATCCCC GCAGCAAACA TGCCGCCGGG GTCGGTCGAA CAGGTGGCGG CCAAGGTGGT 840
GCCCAGTGTC GTCATGTTGG AAACCGATCT GGGCCGCCAG TCGGAGGAGG GCTCCGGCAT 900
CATTCTGTCT GCCGAGGGGC TGATCTTGAC CAACAACCAC GTGATCGCGG CGGCCGCCAA 960
GCCTCCCCTG GGCAGTCCGC CGCCGAAAAC GACGGTAACC TTCTCTGACG GGCGGACCGC 1020
ACCCTTCACG GTGGTGGGGG CTGACCCCAC CAGTGATATC GCCGTCGTCC GTGTTCAGGG 1080
CGTCTCCGGG CTCACCCCGA TCTCCCTGGG TTCCTCCTCG GACCTGAGGG TCGGTCAGCC 1140
GGTGCTGGCG ATCGGGTCGC CGCTCGGTTT GGAGGGCACC GTGACCACGG GGATCGTCAG 1200
CGCTCTCAAC CGTCCAGTGT CGACGACCGG CGAGGCCGGC AACCAGAACA CCGTGCTGGA 1260
CGCCATTCAG ACCGACGCCG CGATCAACCC CGGTAACTCC GGGGGCGCGC TGGTGAACAT 1320
GAACGCTCAA CTCGTCGGAG TCAACTCGGC CATTGCCACG CTGGGCGCGG ACTCAGCCGA 1380
TGCGCAGAGC GGCTCGATCG GTCTCGGTTT TGCGATTCCA GTCGACCAGG CCAAGCGCAT 1440
CGCCGACGAG TTGATCAGCA CCGGCAAGGC GTCACATGCC TCCCTGGGTG TGCAGGTGAC 1500
CAATGACAAA GACACCCCGG GCGCCAAGAT CGTCGAAGTA GTGGCCGGTG GTGCTGCCGC 1560
GAACGCTGGA GTGCCGAAGG GCGTCGTTGT CACCAAGGTC GACGACCGCC CGATCAACAG 1620
CGCGGACGCG TTGGTTGCCG CCGTGCGGTC CAAAGCGCCG GGCGCCACGG TGGCGCTAAC 1680
CTTTCAGGAT CCCTCGGGCG GTAGCCGCAC AGTGCAAGTC ACCCTCGGCA AGGCGGAGCA 1740
GTGATGAAGG TCGCCGCGCA GTGTTCAAAG C 1771

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1058 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: 

CTCCACCGCG GTGGCGGCCG CTCTAGAACT AGTGGATCCC CCGGGCTGCA GGAATTCGGC 60 

ACGAGGATCC GACGTCGCAG GTTGTCGAAC CCGCCGCCGC GGAAGTATCG GTCCATGCCT 120 

AGCCCGGCGA CGGCGAGCGC CGGAATGGCG CGAGTGAGGA GGCGGGCAAT TTGGCGGGGC 180 

CCGGCGACGG CGAGCGCCGG AATGGCGCGA GTGAGGAGGC GGGCAGTCAT GCCCAGCGTG 240

ATCCAATCAA CCTGCATTCG GCCTGCGGGC CCATTTGACA ATCGAGGTAG TGAGCGCAAA 300 

TGAATGATGG AAAACGGGCG GTGACGTCCG CTGTTCTGGT GGTGCTAGGT GCCTGCCTGG 360 

CGTTGTGGCT ATCAGGATGT TCTTCGCCGA AACCTGATGC CGAGGAACAG GGTGTTCCCG 420

TGAGCCCGAC GGCGTCCGAC CCCGCGCTCC TCGCCGAGAT CAGGCAGTCG CTTGATGCGA 480

CAAAAGGGTT GACCAGCGTG CACGTAGCGG TCCGAACAAC CGGGAAAGTC GACAGCTTGC 540

TGGGTATTAC CAGTGCCGAT GTCGACGTCC GGGCCAATCC GCTCGCGGCA AAGGGCGTAT 600

GCACCTACAA CGACGAGCAG GGTGTCCCGT TTCGGGTACA AGGCGACAAC ATCTCGGTGA 660

AACTGTTCGA CGACTGGAGC AATCTCGGCT CGATTTCTGA ACTGTCAACT TCACGCGTGC 720

TCGATCCTGC CGCTGGGGTG ACGCAGCTGC TGTCCGGTGT CACGAACCTC CAAGCGCAAG 780

GTACCGAAGT GATAGACGGA ATTTCGACCA CCAAAATCAC CGGGACCATC CCCGCGAGCT 840 

CTGTCAAGAT GCTTGATCCT GGCGCCAAGA GTGCAAGGCC GGCGACCGTG TGGATTGCCC 900 

AGGACGGCTC GCACCACCTC GTCCGAGCGA GCATCGACCT CGGATCCGGG TCGATTCAGC 960 

TCACGCAGTC GAAATGGAAC GAACCCGTCA ACGTCGACTA GGCCGAAGTT GCGTCGACGC 1020

GTTGNTCGAA ACGCCCTTGT GAACGGTGTC AACGGNAC 1058 
(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 542 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15: 

GAATTCGGCA CGAGAGGTGA TCGACATCAT CGGGACCAGC CCCACATCCT GGGAACAGGC 60 

GGCGGCGGAG GCGGTCCAGC GGGCGCGGGA TAGCGTCGAT GACATCCGCG TCGCTCGGGT 120 

CATTGAGCAG GACATGGCCG TGGACAGCGC CGGCAAGATC ACCTACCGCA TCAAGCTCGA 180 

AGTGTCGTTC AAGATGAGGC CGGCGCAACC GCGCTAGCAC GGGCCGGCGA GCAAGACGCA 240

AAATCGCACG GTTTGCGGTT GATTCGTGCG ATTTTGTGTC TGCTCGCCGA GGCCTACCAG 300 

GCGCGGCCCA GGTCCGCGTG CTGCCGTATC CAGGCGTGCA TCGCGATTCC GGCGGCCACG 360 

CCGGAGTTAA TGCTTCGCGT CGACCCGAAC TGGGCGATCC GCCGGNGAGC TGATCGATGA 420 

CCGTGGCCAG CCCGTCGATG CCCGAGTTGC CCGAGGAAAC GTGCTGCCAG GCCGGTAGGA 480

AGCGTCCGTA GGCGGCGGTG CTGACCGGCT CTGCCTGCGC CCTCAGTGCG GCCAGCGAGC 540

GG 542
(2) INFORMATION FOR SEQ ID NO: 16:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 913 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16: 

CGGTGCCGCC CGCGCCTCCG TTGCCCCCAT TGCCGCCGTC GCCGATCAGC TGCGCATCGC 60 

CACCATCACC GCCTTTGCCG CCGGCACCGC CGGTGGCGCC GGGGCCGCCG ATGCCACCGC 120 

TTGACCCTGG CCGCCGGCGC CGCCATTGCC ATACAGCACC CCGCCGGGGG CACCGTTACC 180 

GCCGTCGCCA CCGTCGCCGC CGCTGCCGTT TCAGGCCGGG GAGGCCGAAT GAACCGCCGC 240

CAAGCCCGCC GCCGGCACCG TTGCCGCCTT TTCCGCCCGC CCCGCCGGCG CCGCCAATTG 300 

CCGAACAGCC AMGCACCGTT GCCGCCAGCC CCGCCGCCGT TAACGGCGCT GCCGGGCGCC 360 

GCCGCCGGAC CCGCCATTAC CGCCGTTCCC GTTCGGTGCC CCGCCGTTAC CGGCGCCGCC 420

GTTTGCCGCC AATATTCGGC GGGCACCGCC AGACCCGCCG GGGCCACCAT TGCCGCCGGG 480

CACCGAAACA ACAGCCCAAC GGTGCCGCCG GCCCCGCCGT TTGCCGCCAT CACCGGCCAT 540 

TCACCGCCAG CACCGCCGTT AATGTTTATG AACCCGGTAC CGCCAGCGCG GCCCCTATTG 600 

CCGGGCGCCG GAGNGCGTGC CCGCCGGCGC CGCCAACGCC CAAAAGCCCG GGGTTGCCAC 660 

CGGCCCCGCC GGACCCACCG GTCCCGCCGA TCCCCCCGTT GCCGCCGGTG CCGCCGCCAT 720 

TGGTGCTGCT GAAGCCGTTA GCGCCGGTTC CGCSGGTTCC GGCGGTGGCG CCNTGGCCGC 780

CGGCCCCGCC GTTGCCGTAC AGCCACCCCC CGGTGGCGCC GTTGCCGCCA TTGCCGCCAT 840

TGCCGCCGTT GCCGCCATTG CCGCCGTTCC CGCCGCCACC GCCGGNTTGG CCGCCGGCGC 900 

CGCCGGCGGC CGC 913 
(2) INFORMATION FOR SEQ ID NO: 17: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1872 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 



GACTACGTTG GTGTAGAAAA ATCCTGCCGC CCGGACCCTT AAGGCTGGGA CAATTTCTGA 60
TAGCTACCCC GACACAGGAG GTTACGGGAT GAGCAATTCG CGCCGCCGCT CACTCAGGTG 120
GTCATGGTTG CTGAGCGTGC TGGCTGCCGT CGGGCTGGGC CTGGCCACGG CGCCGGCCCA 180
GGCGGCCCCG CCGGCCTTGT CGCAGGACCG GTTCGCCGAC TTCCCCGCGC TGCCCCTCGA 240
CCCGTCCGCG ATGGTCGCCC AAGTGGCGCC ACAGGTGGTC AACATCAACA CCAAACTGGG 300
CTACAACAAC GCCGTGGGCG CCGGGACCGG CATCGTCATC GATCCCAACG GTGTCGTGCT 360
GACCAACAAC CACGTGATCG CGGGCGCCAC CGACATCAAT GCGTTCAGCG TCGGCTCCGG 420
CCAAACCTAC GGCGTCGATG TGGTCGGGTA TGACCGCACC CAGGATGTCG CGGTGCTGCA 480
GCTGCGCGGT GCCGGTGGCC TGCCGTCGGC GGCGATCGGT GGCGGCGTCG CGGTTGGTGA 540
GCCCGTCGTC GCGATGGGCA ACAGCGGTGG GCAGGGCGGA ACGCCCCGTG CGGTGCCTGG 600
CAGGGTGGTC GCGCTCGGCC AAACCGTGCA GGCGTCGGAT TCGCTGACCG GTGCCGAAGA 660
GACATTGAAC GGGTTGATCC AGTTCGATGC CGCAATCCAG CCCGGTGATT CGGGCGGGCC 720
CGTCGTCAAC GGCCTAGGAC AGGTGGTCGG TATGAACACG GCCGCGTCCG ATAACTTCCA 780
GCTGTCCCAG GGTGGGCAGG GATTCGCCAT TCCGATCGGG CAGGCGATGG CGATCGCGGG 840
CCAAATCCGA TCGGGTGGGG GGTCACCCAC CGTTCATATC GGGCCTACCG CCTTCCTCGG 900
CTTGGGTGTT GTCGACAACA ACGGCAACGG CGCACGAGTC CAACGCGTGG TCGGAAGCGC 960
TCCGGCGGCA AGTCTCGGCA TCTCCACCGG CGACGTGATC ACCGCGGTCG ACGGCGCTCC 1020
GATCAACTCG GCCACCGCGA TGGCGGACGC GCTTAACGGG CATCATCCCG GTGACGTCAT 1080
CTCGGTGAAC TGGCAAACCA AGTCGGGCGG CACGCGTACA GGGAACGTGA CATTGGCCGA 1140
GGGACCCCCG GCCTGATTTG TCGCGGATAC CACCCGCCGG CCGGCCAATT GGATTGGCGC 1200
CAGCCGTGAT TGCCGCGTGA GCCCCCGAGT TCCGTCTCCC GTGCGCGTGG CATTGTGGAA 1260
GCAATGAACG AGGCAGAACA CAGCGTTGAG CACCCTCCCG TGCAGGGCAG TTACGTCGAA 1320
GGCGGTGTGG TCGAGCATCC GGATGCCAAG GACTTCGGCA GCGCCGCCGC CCTGCCCGCC 1380
GATCCGACCT GGTTTAAGCA CGCCGTCTTC TACGAGGTGC TGGTCCGGGC GTTCTTCGAC 1440
GCCAGCGCGG ACGGTTCCGN CGATCTGCGT GGACTCATCG ATCGCCTCGA CTACCTGCAG 1500
TGGCTTGGCA TCGACTGCAT CTGTTGCCGC CGTTCCTACG ACTCACCGCT GCGCGACGGC 1560
GGTTACGACA TTCGCGACTT CTACAAGGTG CTGCCCGAAT TCGGCACCGT CGACGATTTC 1620


GTCGCCCTGG 


T C AC AO L, <jU 




CXI T A TCCG C A 


TCATCACCGA CCTGGTGATG 1680
AATCACACCT CGGAGTCGCA CCCCTGGTTT CAGGAGTCCC GCCGCGACCC AGACGGACCG 1740
TACGGTGACT ATTACGTGTG GAGCGACACC AGCGAGCGCT ACACCGACGC CCGGATCATC 1800
TTCGTCGACA CCGAAGAGTC GAACTGGTCA TTCGATCCTG TCCGCCGACA GTTNCTACTG 1860
GCACCGATTC TT 1872



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1482 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
CTTCGCCGAA ACCTGATGCC GAGGAACAGG GTGTTCCCGT GAGCCCGACG GCGTCCGACC 60
CCGCGCTCCT CGCCGAGATC AGGCAGTCGC TTGATGCGAC AAAAGGGTTG ACCAGCGTGC 120
ACGTAGCGGT CCGAACAACC GGGAAAGTCG ACAGCTTGCT GGGTATTACC AGTGCCGATG 180
TCGACGTCCG GGCCAATCCG CTCGCGGCAA AGGGCGTATG CACCTACAAC GACGAGCAGG 240
GTGTCCCGTT TCGGGTACAA GGCGACAACA TCTCGGTGAA ACTGTTCGAC GACTGGAGCA 300
ATCTCGGCTC GATTTCTGAA CTGTCAACTT CACGCGTGCT CGATCCTGCC GCTGGGGTGA 360
CGCAGCTGCT GTCCGGTGTC ACGAACCTCC AAGCGCAAGG TACCGAAGTG ATAGACGGAA 420
TTTCGACCAC CAAAATCACC GGGACCATCC CCGCGAGCTC TGTCAAGATG CTTGATCCTG 480
GCGCCAAGAG TGCAAGGCCG GCGACCGTGT GGATTGCCCA GGACGGCTCG CACCACCTCG 540
TCCGAGCGAG CATCGACCTC GGATCCGGGT CGATTCAGCT CACGCAGTCG AAATGGAACG 600
AACCCGTCAA CGTCGACTAG GCCGAAGTTG CGTCGACGCG TTGCTCGAAA CGCCCTTGTG 660
AACGGTGTCA ACGGCACCCG AAAACTGACC CCCTGACGGC ATCTGAAAAT TGACCCCCTA 720
GACCGGGCGG TTGGTGGTTA TTCTTCGGTG GTTCCGGCTG GTGGGACGCG GCCGAGGTCG 780
CGGTCTTTGA GCCGGTAGCT GTOGCCTTTG AGGGCGACGA CTTCAGCATG GTGGACGAGG 840
CGGTCGATCA TGGCGGCAGC AACGACGTCG TCGCCGCCGA AAACCTCGCC CCACCGGCCG 900
AAGGCCTTAT TGGACGTGAC GATCAAGCTG GCCCGCTCAT ACCGGGAGGA CACCAGCTGG 960
AAGAAGAGGT TGGCGGCCTC GGGCTCAAAC GGAATGTAAC CGACTTCGTC AACCACCAGG 1020
AGCGGATAGC GGCCAAACCG GGTGAGTTCG GCGTAGATGC GCCCGGCGTG GTGAGCCTCG 1080
GCGAACCGTG CTACCCATTC GGCGGCGGTG GCGAACAGCA CCCGATGACC GGCCTGACAC 1140
GCGCGTATCG CCAGGCCGAC CGCAAGATGA GTCTTCCCGG TGCCAGGCGG GGCCCAAAAA 1200
CACGACGTTA TCGCGGGCGG TGATGAAATC CAGGGTGCCC AGATGTGCGA TGGTGTCGCG 1260
TTTGAGGCCA CGAGCATGCT CAAAGTCGAA CTCTTCCAAC GACTTCCGAA CCGGGAAGCG 1320
GGCGGCGCGG ATGCGGCCCT CACCACCATG GGACTCCCGG GCTGACACTT CCCGCTGCAG 1380
GCAGGCGGCC AGGTATTCTT CGTGGCTCCA GTTCTCGGCG CGGGCGCGAT CGGCCAGCCG 1440
GGACACTGAC TCACGCAGGG TGGGAGCTTT CAATGCTCTT GT 1482

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 876 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

GAATTCGGCA CGAGCCGGCG ATAGCTTCTG GGCCGCGGCC GACCAGATGG CTCGAGGGTT 60 

CGTGCTCGGG GCCACCGCCG GGCGCACCAC CCTGACCGGT GAGGGCCTGC AACACGCCGA 120 

CGGTCACTCG TTGCTGCTGG ACGCCACCAA CCCGGCGGTG GTTGCCTACG ACCCGGCCTT 180 

CGCCTACGAA ATCGGCTACA TCGNGGAAAG CGGACTGGCC AGGATGTGCG GGGAGAACCC 240 

GGAGAACATC TTCTTCTACA TCACCGTCTA CAACGAGCCG TACGTGCAGC CGCCGGAGCC 300 

GGAGAACTTC GATCCCGAGG GCGTGCTGGG GGGTATCTAC CGNTATCACG CGGCCACCGA 360 

GCAACGCACC AACAAGGNGC AGATCCTGGC CTCCGGGGTA GCGATGCCCG CGGCGCTGCG 420

GGCAGCACAG ATGCTGGCCG CCGAGTGGGA TGTCGCCGCC GACGTGTGGT CGGTGACCAG 480

TTGGGGCGAG CTAAACCGCG ACGGGGTGGT CATCGAGACC GAGAAGCTCC GCCACCCCGA 540

TCGGCCGGCG GGCGTGCCCT ACGTGACGAG AGCGCTGGAG AATGCTCGGG GCCCGGTGAT 600 

CGCGGTGTCG GACTGGATGC GCGCGGTCCC CGAGCAGATC CGACCGTGGG TGCCGGGCAC 660 

ATACCTCACG TTGGGCACCG ACGGGTTCGG TTTTTCCGAC ACTCGGCCCG CCGGTCGTCG 720 

TTACTTCAAC ACCGACGCCG AATCCCAGGT TGGTCGCGGT TTTGGGAGGG GTTGGCCGGG 780 

TCGACGGGTG AATATCGACC CATTCGGTGC CGGTCGTGGG CCGCCCGCCC AGTTACCCGG 840

ATTCGACGAA GGTGGGGGGT TGCGCCCGAN TAAGTT 876
(2) INFORMATION FOR SEQ ID NO: 20: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1021 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20: 
ATCCCCCCGG GCTGCAGGAA TTCGGCACGA GAGACAAAAT TCCACGCGTT AATGCAGGAA 60
CAGATTCATA ACGAATTCAC AGCGGCACAA CAATATGTCG CGATCGCGGT TTATTTCGAC 120
AGCGAAGACC TGCCGCAGTT GGCGAAGCAT TTTTACAGCC AAGCGGTCGA GGAACGAAAC 180
CATGCAATGA TGCTCGTGCA ACACCTGCTC GACCGCGACC TTCGTGTCGA AATTCCCGGC 240
GTAGACACGG TGCGAAACCA GTTCGACAGA CCCCGCGAGG CACTGGCGCT GGCGCTCGAT 300
CAGGAACGCA CAGTCACCGA CCAGGTCGGT CGGCTGACAG CGGTGGCCCG CGACGAGGGC 360
GATTTCCTCG GCGAGCAGTT CATGCAGTGG TTCTTGCAGG AACAGATCGA AGAGGTGGCC 420
TTGATGGCAA CCCTGGTGCG GGTTGCCGAT CGGGCCGGGG CCAACCTGTT CGAGCTAGAG 480
AACTTCGTCG CACGTGAAGT GGATGTGGCG CCGGCCGCAT CAGGCGCCCC GCACGCTGCC 540
GGGGGCCGCC TCTAGATCCC TGGGGGGGAT CAGCGAGTGG TCCCGTTCGC CCGCCCGTCT 600
TCCAGCCAGG CCTTGGTGCG GCCGGGGTGG TGAGTACCAA TCCAGGCCAC CCCGACCTCC 660
CGGNAAAAGT CGATGTCCTC GTACTCATCG ACGTTCCAGG AGTACACCGC CCGGCCCTGA 720
GCTGCCGAGC GGTCAACGAG TTGCGGATAT TCCTTTAACG CAGGCAGTGA GGGTCCCACG 780
GCGGTTGGCC CGACCGCCGT GGCCGCACTG CTGGTCAGGT ATCGGGGGGT CTTGGCGAGC 840
AACAACGTCG GCAGGAGGGG TGGAGCCCGC CGGATCCGCA GACCGGGGGG GCGAAAACGA 900
CATCAACACC GCACGGGATC GATCTGCGGA GGGGGGTGCG GGAATACCGA ACCGGTGTAG 960
GAGCGCCAGC AGTTGTTTTT CCACCAGCGA AGCGTTTTCG GGTCATCGGN GGCNNTTAAG 1020
T 1021

(2) INFORMATION FOR SEQ ID NO: 21:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 321 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 21:
CGTGCCGACG AACGGAAGAA CACAACCATG AAGATGGTGA AATCGATCGC CGCAGGTCTG 60
ACCGCCGCGG CTGCAATCGG CGCCGCTGCG GCCGGTGTGA CTTCGATCAT GGCTGGCGGN 120
CCGGTCGTAT ACCAGATGCA GCCGGTCGTC TTCGGCGCGC CACTGCCGTT GGACCCGGNA 180
TCCGCCCCTG ANGTCCCGAC CGCCGCCCAG TGGACCAGNC TGCTCAACAG NCTCGNCGAT 240
CCCAACGTGT CGTTTGNGAA CAAGGGNAGT CTGGTCGAGG GNGGNATCGG NGGNANCGAG 300
GGNGNGNATC GNCGANCACA A 321
(2) INFORMATION FOR SEQ ID NO: 22: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 373 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 22: 



TCTTATCGGT TCCGGTTGGC GACGGGTTTT GGGNGCGGGT GGTTAACCCG CTCGGCCAGC 60
CGATCGACGG GCGCGGAGAC GTCGACTCCG ATACTCGGCG CGCGCTGGAG CTCCAGGCGC 120
CCTCGGTGGT GNACCGGCAA GGCGTGAAGG AGCCGTTGNA GACCGGGATC AAGGCGATTG 180
ACGCGATGAC CCCGATCGGC CGCGGGCAGC GCCAGCTGAT CATCGGGGAC CGCAAGACCG 240
GCAAAAACCG CCGTCTGTGT CGGACACCAT CCTCAAACCA GCGGGAAGAA CTGGGAGTCC 300
GGTGGATCCC AAGAAGCAGG TGCGCTTGTG TATACGTTGG CCATCGGGCA AGAAGGGGAA 360
CTTACCATCG CCG 373



(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 352 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 
GTGACGCCGT GATGGGATTC CTGGGCGGGG CCGGTCCGCT GGCGGTGGTG GATCAGCAAC 60
TGGTTACCCG GGTGCCGCAA GGCTGGTCGT TTGCTCAGGC AGCCGCTGTG CCGGTGGTGT 120
TCTTGACGGC CTGGTACGGG TTGGCCGATT TAGCCGAGAT CAAGGCGGGC GAATCGGTGC 180
TGATCCATGC CGGTACCGGC GGTGTGGGCA TGGCGGCTGT GCAGCTGGCT CGCCAGTGGG 240
GCGTGGAGGT TTTCGTCACC GCCAGCCGTG GNAAGTGGGA CACGCTGCGC GCCATNGNGT 300
TTGACGACGA NCCATATCGG NGATTCCCNC ACATNCGAAG TTCCGANGGA GA 352
(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 726 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:
GAAATCCGCG TTCATTCCGT TCGACCAGCG GCTGGCGATA ATCGACGAAG TGATCAAGCC 60
GCGGTTCGCG GCGCTCATGG GTCACAGCGA GTAATCAGCA AGTTCTCTGG TATATCGCAC 120
CTAGCGTCCA GTTGCTTGCC AGATCGCTTT CGTACCGTCA TCGCATGTAC CGGTTCGCGT 180
GCCGCACGCT CATGCTGGCG GCGTGCATCC TGGCCACGGG TGTGGCGGGT CTCGGGGTCG 240
GCGCGCAGTC CGCAGCCCAA ACCGCGCCGG TGCCCGACTA CTACTGGTGC CCGGGGCAGC 300
CTTTCGACCC CGCATGGGGG CCCAACTGGG ATCCCTACAC CTGCCATGAC GACTTCCACC 360
GCGACAGCGA CGGCCCCGAC CACAGCCGCG ACTACCCCGG ACCCATCCTC GAAGGTCCCG 420
TGCTTGACGA TCCCGGTGCT GCGCCGCCGC CCCCGGCTGC CGGTGGCGGC GCATAGCGCT 480
CGTTGACCGG GCCGCATCAG CGAATACGCG TATAAACCCG GGCGTGCCCC CGGCAAGCTA 540
CGACCCCCGG CGGGGCAGAT TTACGCTCCC GTGCCGATGG ATCGCGCCGT CCGATGACAG 600
AAAATAGGCG ACGGTTTTGG CAACCGCTTG GAGGACGCTT GAAGGGAACC TGTCATGAAC 660
GGCGACAGCG CCTCCACCAT CGACATCGAC AAGGTTGTTA CCCGCACACC CGTTCGCCGG 720
ATCGTG 726

(2) INFORMATION FOR SEQ ID NO: 25: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 580 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

CGCGACGACG ACGAACGTCG GGCCCACCAC CGCCTATGCG TTGATGCAGG CGACCGGGAT 60 

GGTCGCCGAC CATATCCAAG CATGCTGGGT GCCCACTGAG CGACCTTTTG ACCAGCCGGG 120 

CTGCCCGATG GCGGCCCGGT GAAGTCATTG CGCCGGGGCT TGTGCACCTG ATGAACCCGA 180

ATAGGGAACA ATAGGGGGGT GATTTGGCAG TTCAATGTCG GGTATGGCTG GAAATCCAAT 240

GGCGGGGCAT GCTCGGCGCC GACCAGGCTC GCGCAGGCGG GCCAGCCCGA ATCTGGAGGG 300 

AGCACTCAAT GGCGGCGATG AAGCCCCGGA CCGGCGACGG TCCTTTGGAA GCAACTAAGG 360 

AGGGGCGCGG CATTGTGATG CGAGTACCAC TTGAGGGTGG CGGTCGCCTG GTCGTCGAGC 420

TGACACCCGA CGAAGCCGCC GCACTGGGTG ACGAACTCAA AGGCGTTACT AGCTAAGACC 480

AGCCCAACGG CGAATGGTCG GCGTTACGCG CACACCTTCC GGTAGATGTC CAGTGTCTGC 540

TCGGCGATGT ATGCCCAGGA GAACTCTTGG ATACAGCGCT 580 
(2) INFORMATION FOR SEQ ID NO:26: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 160 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26: 
AACGGAGGCG CCGGGGGTTT TGGCGGGGCC GGGGCGGTCG GCGGCAACGG CGGGGCCGGC 60 
GGTACCGCCG GGTTGTTCGG TGTCGGCGGG GCCGGTGGGG CCGGAGGCAA CGGCATCGCC 120 
GGTGTCACGG GTACGTCGGC CAGCACACCG GGTGGATCCG 160 
(2) INFORMATION FOR SEQ ID NO: 27: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 272 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:
GACACCGATA CGATGGTGAT GTACGCCAAC GTTGTCGACA CGCTCGAGGC GTTCACGATC 60
CAGCGCACAC CCGACGGCGT GACCATCGGC GATGCGGCCC CGTTCGCGGA GGCGGCTGCC 120
AAGGCGATGG GAATCGACAA GCTGCGGGTA ATTCATACCG GAATGGACCC CGTCGTCGCT 180
GAACGCGAAC AGTGGGACGA CGGCAACAAC ACGTTGGCGT TGGCGCCCGG TGTCGTTGTC 240
GCCTACGAGC GCAACGTACA GACCAACGCC CG 272
(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 317 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 28:

GCAGCCGGTG GTTCTCGGAC TATCTGCGCA CGGTGACGCA GCGCGACGTG CGCGAGCTGA 60

AGCGGATCGA GCAGACGGAT CGCCTGCCGC GGTTCATGCG CTACCTGGCC GCTATCACCG 120

CGCAGGAGCT GAACGTGGCC GAAGCGGCGC GGGTCATCGG GGTCGACGCG GGGACGATCC 180

GTTCGGATCT GGCGTGGTTC GAGACGGTCT ATCTGGTACA TCGCCTGCCC GCCTGGTCGC 240

GGAATCTGAC CGCGAAGATC AAGAAGCGGT CAAAGATCCA CGTCGTCGAC AGTGGCTTCG 300

CGGCCTGGTT GCGCGGG 317
(2) INFORMATION FOR SEQ ID NO: 29:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 182 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 
GATCGTGGAG CTGTCGATGA ACAGCGTTGC CGGACGCGCG GCGGCCAGCA CGTCGGTGTA 60
GCAGCGCCGG ACCACCTCGC CGGTGGGCAG CATGGTGATG ACCACGTCGG CCTCGGCCAC 120
CGCTTCGGGC GCGCTACGAA ACACCGCGAC ACCGTGCGCG GCGGCGCCGG ACGCCGCCGT 180
GG 182

(2) INFORMATION FOR SEQ ID NO: 30:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 308 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

GATCGCGAAG TTTGGTGAGC AGGTGGTCGA CGCGAAAGTC TGGGCGCCTG CGAAGCGGGT 60

CGGCGTTCAC GAGGCGAAGA CACGCCTGTC CGAGCTGCTG CGGCTCGTCT ACGGCGGGCA 120

GAGGTTGAGA TTGCCCGCCG CGGCGAGCCG GTAGCAAAGC TTGTGCCGCT GCATCCTCAT 180

GAGACTCGGC GGTTAGGCAT TGACCATGGC GTGTACCGCG TGCCCGACGA TTTGGACGCT 240

CCGTTGTCAG ACGACGTGCT CGAACGCTTT CACCGGTGAA GCGCTACCTC ATCGACACCC 300

ACGTTTGG 308

(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 267 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31:
CCGACGACGA GCAACTCACG TGGATGATGG TCGGCAGCGG CATTGAGGAC GGAGAGAATC 60
CGGCCGAAGC TGCCGCGCGG CAAGTGCTCA TAGTGACCGG CCGTAGAGGG CTCCCCCGAT 120
GGCACCGGAC TATTCTGGTG TGCCGCTGGC CGGTAAGAGC GGGTAAAAGA ATGTGAGGGG 180
ACACGATGAG CAATCACACC TACCGAGTGA TCGAGATCGT CGGGACCTCG CCCGACGGCG 240
TCGACGCGGC AATCCAGGGC GGTCTGG 267
(2) INFORMATION FOR SEQ ID NO: 32:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1539 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 
CTCGTGCCGA AAGAATGTGA GGGGACACGA TGAGCAATCA CACCTACCGA GTGATCGAGA 60 

TCGTCGGGAC CTCGCCCGAC GGCGTCGACG CGGCAATCCA GGGCGGTCTG GCCCGAGCTG 120 

CGCAGACCAT GCGCGCGCTG GACTGGTTCG AAGTACAGTC AATTCGAGGC CACCTGGTCG 180 

ACGGAGCGGT CGCGCACTTC CAGGTGACTA TGAAAGTCGG CTTCCGCTGG AGGATTCCTG 240 

AACCTTCAAG CGCGGCCGAT AACTGAGGTG CATCATTAAG CGACTTTTCC AGAACATCCT 300 

GACGCGCTCG AAACGCGGTT CAGCCGACGG TGGCTCCGCC GAGGCGCTGC CTCCAAAATC 360 

CCTGCGACAA TTCGTCGGCG GCGCCTACAA GGAAGTCGGT GCTGAATTCG TCGGGTATCT 420 

GGTCGACCTG TGTGGGCTGC AGCCGGACGA AGCGGTGCTC GACGTCGGCT GCGGCTCGGG 480 

GCGGATGGCG TTGCCGCTCA CCGGCTATCT GAACAGCGAG GGACGCTACG CCGGCTTCGA 540 

TATCTCGCAG AAAGCCATCG CGTGGTGCCA GGAGCACATC ACCTCGGCGC ACCCCAACTT 600 

CCAGTTCGAG GTCTCCGACA TCTACAACTC GCTGTACAAC CCGAAAGGGA AATACCAGTC 660 

ACTAGACTTT CGCTTTCCAT ATCCGGATGC GTCGTTCGAT GTGGTGTTTC TTACCTCGGT 720 

GTTCACCCAC ATGTTTCCGC CGGACGTGGA GCACTATCTG GACGAGATCT CCCGCGTGCT 780 

GAAGCCCGGC GGACGATGCC TGTGCACGTA CTTCTTGCTC AATGACGAGT CGTTAGCCCA 840 
CATCGCGGAA GGAAAGAGTG CGCACAACTT CCAGCATGAG GGACCGGGTT ATCGGACAAT 900 
CCACAAGAAG CGGCCCGAAG AAGCAATCGG CTTGCCGGAG ACCTTCGTCA GGGATGTCTA 960 

TGGCAAGTTC GGCCTCGCCG TGCACGAACC ATTGCACTAC GGCTCATGGA GTGGCCGGGA 1020 

ACCACGCCTA AGCTTCCAGG ACATCGTCAT CGCGACCAAA ACCGCGAGCT AGGTCGGCAT 1080 

CCGGGAAGCA TCGCGACACC GTGGCGCCGA GCGCCGCTGC CGGCAGGCCG ATTAGGCGGG 1140 

CAGATTAGCC CGCCGCGGCT CCCGGCTCCG AGTACGGCGC CCCGAATGGC GTCACCGGCT 1200 

GGTAACCACG CTTGCGCGCC TGGGCGGCGG CCTGCCGGAT CAGGTGGTAG ATGCCGACAA 1260 

AGCCTGCGTG ATCGGTCATC ACCAACGGTG ACAGCAGCCb AGCGCGAACG 1320 

CCACCCCGGT CTCCGGGTCT GTCCAGCCGA TCGAGCCGCC CAAGCCCACA TGACCAAACC 1380 

CCGGCATCAC GTTGCCGATC GGCATACCGT GATAGCCAAG ATGAAAATTT AAGGGCACCA 1440 

ATAGATTTCG ATCCGGCAGA ACTTGCCGTC GGTTGCGGGT CAGGCCCGTG ACCAGCTCCC 1500 

GCGACAAGAA CCGTATGCCG TCGATCTCGC CTCGTGCCG 1539 



(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 851 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 

CTGCAGGGTG GCGTGGATGA GCGTCACCGC GGGGCAGGCC GAGCTGACCG CCGCCCAGGT 60 
CCGGGTTGCT GCGGCGGCCT ACGAGACGGC GTATGGGCTG ACGGTGCCCC CGCCGGTGAT 120 
CGCCGAGAAC CGTGCTGAAC TGATGATTCT GATAGCGACC AACCTCTTGG GGCAAAACAC 180 
CCCGGCGATC GCGGTCAACG AGGCCGAATA CGGCGAGATG TGGGCCCAAG ACGCCGCCGC 240 
GATGTTTGGC TACGCCGCGG CGACGGCGAC GGCGACGGCG ACGTTGCTGC CGTTCGAGGA 300 
GGCGCCGGAG ATGACCAGCG CGGGTGGGCT CCTCGAGCAG GCCGCCGCGG TCGAGGAGGC 360 
CTCCGACACC GCCGCGGCGA ACCAGTTGAT GAACAATGTG CCCCAGGCGC TGAAACAGTT 420 
GGCCCAGCCC ACGCAGGGCA CCACGCCTTC TTCCAAGCTG GGTGGCCTGT GGAAGACGGT 480 
CTCGCCGCAT CGGTCGCCGA TCAGCAACAT GGTGTCGATG GCCAACAACC ACATGTCGAT 540 
GACCAACTCG GGTGTGTCGA TGACCAACAC CTTGAGCTCG ATGTTGAAGG GCTTTGCTCC 600 
GGCGGCGGCC GCCCAGGCCG TGCAAACCGC GGCGCAAAAC GGGGTCCGGG CGATGAGCTC 660 
GCTGGGCAGC TCGCTGGGTT CTTCGGGTCT GGGCGGTGGG GTGGCCGCCA ACTTGGGTCG 720 
GGCGGCCTCG GTACGGTATG GTCACCGGGA TGGCGGAAAA TATGCANAGT CTGGTCGGCG 780 
GAACGGTGGT CCGGCGTAAG GTTTACCCCC GTTTTCTGGA TGCGGTGAAC TTCGTCAACG 840 
GAAACAGTTA C 851 

(2) INFORMATION FOR SEQ ID NO: 34: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 254 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34: 

GATCGATCGG GCGGAAATTT GGACCAGATT CGCCTCCGGC GATAACCCAA TCAATCGAAC 60 

CTAGATTTAT TCCGTCCAGG GGCCCGAGTA ATGGCTCGCA GGAGAGGAAC CTTACTGCTG 120 

CGGGCACCTG TCGTAGGTCC TCGATACGGC GGAAGGCGTC GACATTTTCC ACCGACACCC 180 

CCATCCAAAC GTTCGAGGGC CACTCCAGCT TGTGAGCGAG GCGACGCAGT CGCAGGCTGC 240 

GCTTGGTCAA GATC 254 
(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1227 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35: 

GATCCTGACC GAAGCGGCCG CCGCCAAGGC GAAGTCGCTG TTGGACCAGG AGGGACGGGA 60 

CGATCTGGCG CTGCGGATCG CGGTTCAGCC GGGGGGGTGC GCTGGATTGC GCTATAACCT 120 

TTTCTTCGAC GACCGGACGC TGGATGGTGA CCAAACCGCG GAGTTCGGTG GTGTCAGGTT 180 

GATCGTGGAC CGGATGAGCG CGCCGTATGT GGAAGGCGCG TCGATCGATT TCGTCGACAC 240 

TATTGAGAAG CAAGGTTCAC CATCGACAAT CCCAACGCCA CCGGCTCCTG CGCGTGCGGG 300 

GATTCGTTCA ACTGATAAAA CGCTAGTACG ACCCCGCGGT GCGCAACACG TACGAGCACA 360 

CCAAGACCTG ACCGCGCTGG AAAAGCAACT GAGCGATGCC TTGCACCTGA CCGCGTGGCG 420 

GGCCGCCGGC GGCAGGTGTC ACCTGCATGG TGAACAGCAC CTGGGCCTGA TATTGCGACC 480 

AGTACACGAT TTTGTCGATC GAGGTCACTT CGACCTGGGA GAACTGCTTG CGGAACGCGT 540 

CGCTGCTCAG CTTGGCCAAG GCCTGATCGG AGCGCTTGTC GCGCACGCCG TCGTGGATAC 600 

CGCACAGCGC ATTGCGAACG ATGGTGTCCA CATCGCGGTT CTCCAGCGCG TTGAGGTATC 660 

CCTGAATCGC GGTTTTGGCC GGTCCCTCCG AGAATGTGCC TGCCGTGTTG GCTCCGTTGG 720 

TGCGGACCCC GTATATGATC GCCGCCGTCA TAGCCGACAC CAGCGCGAGG GCTACCACAA 780 

TGCCGATCAG CAGCCGCTTG TGCCGTCGCT TCGGGTAGGA CACCTGCGGC GGCACGCCGG 840 

GATATGCGGC GGGCGGCAGC GCCGCGTCGT CTGCCGGTCC CGGGGCGAAG GCCGGTTCGG 900 

CGGCGCCGAG GTCGTGGGGG TAGTCCAGGG CTTGGGGTTC GTGGGATGAG GGCTCGGGGT 960 

ACGGCGCCGG TCCGTTGGTG CCGACACCGG GGTTCGGCGA GTGGGGACCG GGCATTGTGG 1020 

TTCTCCTAGG GTGGTGGACG GGACCAGCTG CTAGGGCGAC AACCGCCCGT CGCGTCAGCC 1080 

GGCAGCATCG GCAATCAGGT GAGCTCCCTA GGCAGGCTAG CGCAACAGCT GCCGTCAGCT 1140 

CTCAACGCGA CGGGGCGGGC CGCGGCGCCG ATAATGTTGA AAGACTAGGC AACCTTAGGA 1200 

ACGAAGGACG GAGATTTTGT GACGATC 1227 
(2) INFORMATION FOR SEQ ID NO: 36: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 181 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: 

GCGGTGTCGG CGGATCCGGC GGGTGGTTGA ACGGCAACGG CGGGGCCGGC GGGGCCGGCG 60 

GGACCGGCGC TAACGGTGGT GCCGGCGGCA ACGCCTGGTT GTTCGGGGCC GGCGGGTCCG 120 

GCGGNGCCGG CACCAATGGT GGNGTCGGCG GGTCCGGCGG ATTTGTCTAC GGCAACGGCG 180 

G 181 

(2) INFORMATION FOR SEQ ID NO: 37: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 290 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37: 

GCGGTGTCGG CGGATCCGGC GGGTGGTTGA ACGGCAACGG CGGTGTCGGC GGCCGGGGCG 60 

GCGACGGCGT CTTTGCCGGT GCCGGCGGCC AGGGCGGCCT CGGTGGGCAG GGCGGCAATG 120 

GCGGCGGCTC CACCGGCGGC AACGGCGGTC TTGGCGGCGC GGGCGGTGGC GGAGGCAACG 180 

CCCCGGACGG CGGCTTCGGT GGCAACGGCG GTAAGGGTGG CCAGGGCGGN ATTGGCGGCG 240 

GCACTCAGAG CGCGACCGGC CTCGGNGGTG ACGGCGGTGA CGGCGGTGAC 290 
(2) INFORMATION FOR SEQ ID NO: 38: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 34 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38: 
GATCCAGTGG CATGGNGGGT GTCAGTGGAA GCAT 
(2) INFORMATION FOR SEQ ID NO: 39: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 155 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39: 
GATCGCTGCT CGTCCCCCCC TTGCCGCCGA CGCCACCGGT CCCACCGTTA CCGAACAAGC 60 
TGGCGTGGTC GCCAGCACCC CCGGCACCGC CGACGCCGGA GTCGAACAAT GGCACCGTCG 120 
TATCCCCACC ATTGCCGCCG GNCCCACCGG CACCG 155 
(2) INFORMATION FOR SEQ ID NO: 40: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40: 
ATGGCGTTCA CGGGGCGCCG GGGACCGGGC AGCCCGGNGG GGCCGGGGGG TGG 
(2) INFORMATION FOR SEQ ID NO: 41: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 132 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41: 
GATCCACCGC GGGTGCAGAC GGTGCCCGCG GCGCCACCCC GACCAGCGGC GGCAACGGCG 60 
GCACCGGCGG CAACGGCGCG AACGCCACCG TCGTCGGNGG GGCCGGCGGG GCCGGCGGCA 120 
AGGGCGGCAA CG 132 
(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 132 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42: 
GATCGGCGGC CGGNACGGNC GGGGACGGCG GCAAGGGCGG NAACGGGGGC GCCGNAGCCA 60 
CCNGCCAAGA ATCCTCCGNG TCCNCCAATG GCGCGAATGG CGGACAGGGC GGCAACGGCG 120 
GCANCGGCGG CA 132 
(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 702 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

CGGCACGAGG ATCGGTACCC CGCGGCATCG GCAGCTGCCG ATTCGCCGGG TTTCCCCACC 60 

CGAGGAAAGC CGCTACCAGA TGGCGCTGCC GAAGTAGGGC GATCCGTTCG CGATGCCGGC 120 

ATGAACGGGC GGCATCAAAT TAGTGCAGGA ACCTTTCAGT TTAGCGACGA TAATGGCTAT 180 

AGCACTAAGG AGGATGATCC GATATGACGC AGTCGCAGAC CGTGACGGTG GATCAGCA^G 240 

AGATTTTGAA CAGGGCCAAC GAGGTGGAGG CCCCGATGGC GGACCCACCG ACTGATGTCC 300 

CCATCACACC GTGCGAACTC ACGGNGGNTA AAAACGCCGC CCAACAGNTG GTNTTGTCCG 360 

CCGACAACAT GCGGGAATAC CTGGCGGCCG GTGCCAAAGA GCGGCAGCGT CTGGCGACCT 420 

CGCTGCGCAA CGCGGCCAAG GNGTATGGCG AGGTTGATGA GGAGGCTGCG ACCGCGCTGG 480 

ACAACGACGG CGAAGGAACT GTGCAGGCAG AATCGGCCGG GGCCGTCGGA GGGGACAGTT 540 

CGGCCGAACT AACCGATACG CCGAGGGTGG CCACGGCCGG TGAACCCAAC TTCATGGATC 600 

TCAAAGAAGC GGCAAGGAAG CTCGAAACGG GCGACCAAGG CGCATCGCTC GCGCACTGNG 660 

GGGATGGGTG GAACACTTNC ACCCTGACGC TGCAAGGCGA CG 702 
(2) INFORMATION FOR SEQ ID NO: 44: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 298 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44: 

GAAGCCGCAG CGCTGTCGGG CGACGTGGCG GTCAAAGCGG CATCGCTCGG TGGCGGTGGA 60 

GGCGGCGGGG TGCCGTCGGC GCCGTTGGGA TCCGCGATCG GGGGCGCCGA ATCGGTGCGG 120 

CCCGCTGGCG CTGGTGACAT TGCCGGCTTA GGCCAGGGAA GGGCCGGCGG CGGCGCCGCG 180 






CTGGGCGGCG GTGGCATGGG AATGCCGATG GGTGCCGCGC ATCAGGGACA AGGGGGCGCC 240 

AAGTCCAAGG GTTCTCAGCA GGAAGACGAG GCGCTCTACA CCGAGGATCC TCGTGCCG 298 
(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1058 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45: 

CGGCACGAGG ATCGAATCGC GTCGCCGGGA GCACAGCGTC GCACTGCACC AGTGGAGGAG 60 

CCATGACCTA CTCGCCGGGT AACCCCGGAT ACCCGCAAGC GCAGCCCGCA GGCTCCTACG 120 

GAGGCGTCAC ACCCTCGTTC GCCCACGCCG ATGAGGGTGC GAGCAAGCTA CCGATGTACC 180 

TGAACATCGC GGTGGCAGTG CTCGGTCTGG CTGCGTACTT CGCCAGCTTC GGCCCAATGT 240 

TCACCCTCAG TACCGAACTC GGGGGGGGTG ATGGCGCAGT GTCCGGTGAC ACTGGGCTGC 300 

CGGTCGGGGT GGCTCTGCTG GCTGCGCTGC TTGCCGGGGT GGTTCTGGTG CCTAAGGCCA 360 

AGAGCCATGT GACGGTAGTT GCGGTGCTCG GGGTACTCGG CGTATTTCTG ATGGTCTCGG 420 

CGACGTTTAA CAAGCCCAGC GCCTATTCGA CCGGTTGGGC ATTGTGGGTT GTGTTGGCTT 480 

TCATCGTGTT CCAGGCGGTT GCGGCAGTCC TGGCGCTCTT GGTGGAGACC GGCGCTATCA 540 

CCGCGCCGGC GCCGCGGCCC AAGTTCGACC CGTATGGACA GTACGGGCGG TACGGGCAGT 600 

ACGGGCAGTA CGGGGTGCAG CCGGGTGGGT ACTACGGTCA GCAGGGTGCT CAGCAGGCCG 660 

CGGGACTGCA GTCGCCCGGC CCGCAGCAGT CTCCGCAGCC TCCCGGATAT GGGTCGCAGT 720 

ACGGCGGCTA TTCGTCCAGT CCGAGCCAAT CGGGCAGTGG ATACACTGCT CAGCCCCCGG 780 

CCCAGCCGCC GGCGCAGTCC GGGTCGCAAC AATCGCACCA GGGCCCATCC ACGCCACCTA 840 

CCGGCTTTCC GAGCTTCAGC CCACCACCAC CGGTCAGTGC CGGGACGGGG TCGCAGGCTG 900 

GTTCGGCTCC AGTCAACTAT TCAAACCCCA GCGGGGGCGA GCAGTCGTCG TCCCCCGGGG 960 

GGGCGCCGGT CTAACCGGGC GTTCCCGCGT CCGGTCGCGC GTGTGCGCGA AGAGTGAACA 1020 

GGGTGTCAGC AAGCGCGGAC GATCCTCGTG CCGAATTC 1058 

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 327 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46: 

CGGCACGAGA GACCGATGCC GCTACCCTCG CGCAGGAGGC AGGTAATTTC GAGCGGATCT 60 
CCGGCGACCT GAAAACCCAG ATCGACCAGG TGGAGTCGAC GGCAGGTTCG TTGCAGGGCC 120 
AGTGGCGCGG CGCGGCGGGG ACGGCCGCCC AGGCCGCGGT GGTGCGCTTC CAAGAAGCAG 180 
CCAATAAGCA GAAGCAGGAA CTCGACGAGA TCTCGACGAA TATTCGTCAG GCCGGCGTCC 240 
AATACTCGAG GGCCGACGAG GAGCAGCAGC AGGCGCTGTC CTCGCAAATG GGCTTCTGAC 300 
CCGCTAATAC GAAAAGAAAC GGAGCAA 327 



(2) INFORMATION FOR SEQ ID NO: 47: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 170 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 
CGGTCGCGAT GATGGCGTTG TCGAACGTGA CCGATTCTGT ACCGCCGTCG TTGAGATCAA 60 
CCAACAACGT GTTGGCGTCG GCAAATGTGC CGNACCCGTG GATCTCGGTG ATCTTGTTCT 120 
TCTTCATCAG GAAGTGCACA CCGGCCACCC TGCCCTCGGN TACCTTTCGG 170 

(2) INFORMATION FOR SEQ ID NO: 48: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 127 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48: 
GATCCGGCGG CACGGGGGGT GCCGGCGGCA GCACCGCTGG CGCTGGCGGC AACGGCGGGG 
CCGGGGGTGG CGGCGGAACC GGTGGGTTGC TCTTCGGCAA CGGCGGTGCC GGCGGGCACG 
GGGCCGT 

(2) INFORMATION FOR SEQ ID NO: 49: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 81 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49: 
CGGCGGCAAG GGCGGCACCG CCGGCAACGG GAGCGGCGCG GCCGGCGGCA ACGGCGGCAA 
CGGCGGCTCC GGCCTCAACG G 
(2) INFORMATION FOR SEQ ID NO: 50: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 149 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 
GATCAGGGCT GGCCGGCTCC GGCCAGAAGG GCGGTAACGG AGGAGCTGCC GGATTGTTTG 60 
GCAACGGCGG GGCCGGNGGT GCCGGCGCGT CCAACCAAGC CGGTAACGGC GGNGCCGGCG 120 
GAAACGGTGG TGCCGGTGGG CTGATCTGG 149 
(2) INFORMATION FOR SEQ ID NO: 51: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 355 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51: 

CGGCACGAGA TCACACCTAC CGAGTGATCG AGATCGTCGG GACCTCGCCC GACGGTGTCG 60 

ACGCGGNAAT CCAGGGCGGT CTGGCCCGAG CTGCGCAGAC CATGCGCGCG CTGGACTGGT 120 

TCGAAGTACA GTCAATTCGA GGCCACCTGG TCGACGGAGC GGTCGCGCAC TTCCAGGTGA 180 

CTATGAAAGT CGGCTTCCGC CTGGAGGATT CCTGAACCTT CAAGCGCGGC CGATAACTGA 240 

GGTGCATCAT TAAGCGACTT TTCCAGAACA TCCTGACGCG CTCGAAACGC GGTTCAGCCG 300 

ACGGTGGCTC CGCCGAGGCG CTGCCTCCAA AATCCCTGCG ACAATTCGTC GGCGG 355 
(2) INFORMATION FOR SEQ ID NO: 52: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 999 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

ATGCATCACC ATCACCATCA CATGCATCAG GTGGACCCCA ACTTGACACG TCGCAAGGGA 60 

CGATTGGCGG CACTGGCTAT CGCGGCGATG GCCAGCGCCA GCCTGGTGAC CGTTGCGGTG 120 

CCCGCGACCG CCAACGCCGA TCCGGAGCCA GCGCCCCCGG TACCCACAAC GGCCGCCTCG 180 

CCGCCGTCGA CCGCTGCAGC GCCACCCGCA CCGGCGACAC CTGTTGCCCC CCCACCACCG 240 

GCCGCCGCCA ACACGCCGAA TGCCCAGCCG GGCGATCCCA ACGCAGCACC TCCGCCGGCC 300 

GACCCGAACG CACCGCCGCC ACCTGTCATT GCCCCAAACG CACCCCAACC TGTCCGGATC 360 

GACAACCCGG TTGGAGGATT CAGCTTCGCG CTGCCTGCTG GCTGGGTGGA GTCTGACGCC 420 

GCCCACTTCG ACTACGGTTC AGCACTCCTC AGCAAAACCA CCGGGGACCC GCCATTTCCC 480 

GGACAGCCGC CGCCGGTGGC CAATGACACC CGTATCGTGC TCGGCCGGCT AGACCAAAAG 540 

CTTTACGCCA GCGCCGAAGC CACCGACTCC AAGGCCGCGG CCCGGTTGGG CTCGGACATG 600 

GGTGAGTTCT ATATGCCCTA CCCGGGCACC CGGATCAACC AGGAAACCGT CTCGCTCGAC 660 

GCCAACGGGG TGTCTGGAAG CGCGTCGTAT TACGAAGTCA AGTTCAGCGA TCCGAGTAAG 720 

CCGAACGGCC AGATCTGGAC GGGCGTAATC GGCTCGCCCG CGGCGAACGC ACCGGACGCC 780 

GGGCCCCCTC AGCGCTGGTT TGTGGTATGG CTCGGGACCG CCAACAACCC GGTGGACAAG 840 

GGCGCGGCCA AGGCGCTGGC CGAATCGATC CGGCCTTTGG TCGCCCCGCC GCCGGCGCCG 900 

GCACCGGCTC CTGCAGAGCC CGCTCCGGCG CCGGCGCCGG CCGGGGAAGT CGCTCCTACC 960 

CCGACGACAC CGACACCGCA GCGGACCTTA CCGGCCTGA 999 
(2) INFORMATION FOR SEQ ID NO: 53: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 332 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 53: 

Met His His His His His His Met His Gln Val Asp Pro Asn Leu Thr 
1 5 10 15 

Arg Arg Lys Gly Arg Leu Ala Ala Leu Ala Ile Ala Ala Met Ala Ser 
20 25 30 

Ala Ser Leu Val Thr Val Ala Val Pro Ala Thr Ala Asn Ala Asp Pro 
35 40 45 

Glu Pro Ala Pro Pro Val Pro Thr Thr Ala Ala Ser Pro Pro Ser Thr 
50 55 60 

Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro Val Ala Pro Pro Pro Pro 
65 70 75 80 

Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro Gly Asp Pro Asn Ala Ala 
85 90 95 

Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro Pro Pro Val Ile Ala Pro 
100 105 110 

Asn Ala Pro Gln Pro Val Arg Ile Asp Asn Pro Val Gly Gly Phe Ser 
115 120 125 

Phe Ala Leu Pro Ala Gly Trp Val Glu Ser Asp Ala Ala His Phe Asp 
130 135 140 

Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr Gly Asp Pro Pro Phe Pro 
145 150 155 160 

Gly Gln Pro Pro Pro Val Ala Asn Asp Thr Arg Ile Val Leu Gly Arg 
165 170 175 

Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu Ala Thr Asp Ser Lys Ala 
180 185 190 

Ala Ala Arg Leu Gly Ser Asp Met Gly Glu Phe Tyr Met Pro Tyr Pro 
195 200 205 

Gly Thr Arg Ile Asn Gln Glu Thr Val Ser Leu Asp Ala Asn Gly Val 
210 215 220 

Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys Phe Ser Asp Pro Ser Lys 
225 230 235 240 

Pro Asn Gly Gln Ile Trp Thr Gly Val Ile Gly Ser Pro Ala Ala Asn 
245 250 255 

Ala Pro Asp Ala Gly Pro Pro Gln Arg Trp Phe Val Val Trp Leu Gly 
260 265 270 

Thr Ala Asn Asn Pro Val Asp Lys Gly Ala Ala Lys Ala Leu Ala Glu 
275 280 285 

Ser Ile Arg Pro Leu Val Ala Pro Pro Pro Ala Pro Ala Pro Ala Pro 
290 295 300 

Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala Gly Glu Val Ala Pro Thr 
305 310 315 320 

Pro Thr Thr Pro Thr Pro Gln Arg Thr Leu Pro Ala 
325 330 

(2) INFORMATION FOR SEQ ID NO: 54: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

Asp Pro Val Asp Ala Val Ile Asn Thr Thr Xaa Asn Tyr Gly Gln Val 
1 5 10 15 

Val Ala Ala Leu 
20 

(B) TYPE: amino acid 
(C) STRANDEDNESS: 
(D) TOPOLOGY: linear 



M . v., - - f v »« - Ma ~ S" 



5 

1 



(B) TYPE: amino acid 
(C) STRANDEDNESS: 
(D) TOPOLOGY: linear 



(ltl) seqo.ce « » ^ „ R1 . „. w 



1 

Glu Gly Arg 



(2) INFORMATION FOR SEQ ID NO: 57: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

•» id ^; m Rla Ttp a> „ 

Tyr TVI «P C». « -V «» Pro Phe 
(2) INFORMATION FOR SEQ ID NO: 58: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 58: 

Asp Ile Gly Ser Glu Ser Thr Glu Asp Gln Gln Xaa Ala Val 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 59: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59: 

Ala Glu Glu Ser Ile Ser Thr Xaa Glu Xaa Ile Val Pro 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 60: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Asp Pro Glu Pro Ala Pro Pro Val Pro Thr Ala Ala Ala Ala Pro Pro 
1 5 10 15 



Ala 



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 61: 

Ala Pro Lys Thr Tyr Xaa Glu Glu Leu Lys Gly Thr Asp Thr Gly 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 62: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62: 

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Gln Thr Ser 
1 5 10 15 

Leu Leu Asn Asn Leu Ala Asp Pro Asp Val Ser Phe Ala Asp 
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 63: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 24 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

Gly Cys Gly Asp Arg Ser Gly Gly Asn Leu Asp Gln Ile Arg Leu Arg 
1 5 10 15 

Arg Asp Arg Ser Gly Gly Asn Leu 
20 

(2) INFORMATION FOR SEQ ID NO: 64: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 187 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64: 

Thr Gly Ser Leu Asn Gln Thr His Asn Arg Arg Ala Asn Glu Arg Lys 
1 5 10 15 

Asn Thr Thr Met Lys Met Val Lys Ser Ile Ala Ala Gly Leu Thr Ala 
20 25 30 

Ala Ala Ala Ile Gly Ala Ala Ala Ala Gly Val Thr Ser Ile Met Ala 
35 40 45 

Gly Gly Pro Val Val Tyr Gln Met Gln Pro Val Val Phe Gly Ala Pro 
50 55 60 

Leu Pro Leu Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln 
65 70 75 80 

Leu Thr Ser Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala 
85 90 95 

Asn Lys Gly Ser Leu Val Glu Gly Gly Ile Gly Gly Thr Glu Ala Arg 
100 105 110 

Ile Ala Asp His Lys Leu Lys Lys Ala Ala Glu His Gly Asp Leu Pro 
115 120 125 

Leu Ser Phe Ser Val Thr Asn Ile Gln Pro Ala Ala Ala Gly Ser Ala 
130 135 140 

Thr Ala Asp Val Ser Val Ser Gly Pro Lys Leu Ser Ser Pro Val Thr 
145 150 155 160 

Gln Asn Val Thr Phe Val Asn Glh Gly Gly Trp Met Leu Ser Arg Ala 
165 170 175 

Ser Ala Met Glu Leu Leu Gln Ala Ala Gly Xaa 
180 185 

(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 148 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

Asp Glu Val Thr Val Glu Thr Thr Ser Val Phe Arg Ala Asp Phe Leu 
1 5 10 15 

Ser Glu Leu Asp Ala Pro Ala Gln Ala Gly Thr Glu Ser Ala Val Ser 
20 25 30 

Gly Val Glu Gly Leu Pro Pro Gly Ser Ala Leu Leu Val Val Lys Arg 
35 40 45 

Gly Pro Asn Ala Gly Ser Arg Phe Leu Leu Asp Gln Ala Ile Thr Ser 
50 55 60 

Ala Gly Arg His Pro Asp Ser Asp Ile Phe Leu Asp Asp Val Thr Val 
65 70 75 80 

Ser Arg Arg His Ala Glu Phe Arg Leu Glu Asn Asn Glu Phe Asn Val 
85 90 95 

Val Asp Val Gly Ser Leu Asn Gly Thr Tyr Val Asn Arg Glu Pro Val 
100 105 110 

Asp Ser Ala Val Leu Ala Asn Gly Asp Glu Val Gln Ile Gly Lys Leu 
115 120 125 

Arg Leu Val Phe Leu Thr Gly Pro Lys Gln Gly Glu Asp Asp Gly Ser 
130 135 140 

Thr Gly Gly Pro 
145 

(2) INFORMATION FOR SEQ ID NO: 66: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 230 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66: 

Thr Ser Asn Arg Pro Ala Arg Arg Gly Arg Arg Ala Pro Arg Asp Thr 
1 5 10 15 

Gly Pro Asp Arg Ser Ala Ser Leu Ser Leu Val Arg His Arg Arg Gln 
20 25 30 

Gln Arg Asp Ala Leu Cys Leu Ser Ser Thr Gln Ile Ser Arg Gln Ser 
35 40 45 

Asn Leu Pro Pro Ala Ala Gly Gly Ala Ala Asn Tyr Ser Arg Arg Asn 
50 55 60 

Phe Asp Val Arg Ile Lys Ile Phe Met Leu Val Thr Ala Val Val Leu 
65 70 75 80 

Leu Cys Cys Ser Gly Val Ala Thr Ala Ala Pro Lys Thr Tyr Cys Glu 
85 90 95 

Glu Leu Lys Gly Thr Asp Thr Gly Gln Ala Cys Gln Ile Gln Met Ser 
100 105 110 

Asp Pro Ala Tyr Asn Ile Asn Ile Ser Leu Pro Ser Tyr Tyr Pro Asp 
115 120 125 

Gln Lys Ser Leu Glu Asn Tyr Ile Ala Gln Thr Arg Asp Lys Phe Leu 
130 135 140 

Ser Ala Ala Thr Ser Ser Thr Pro Arg Glu Ala Pro Tyr Glu Leu Asn 
145 150 155 160 

Ile Thr Ser Ala Thr Tyr Gln Ser Ala Ile Pro Pro Arg Gly Thr Gln 
165 170 175 

Ala Val Val Leu Xaa Val Tyr His Asn Ala Gly Gly Thr His Pro Thr 
180 185 190 

Thr Thr Tyr Lys Ala Phe Asp Trp Asp Gln Ala Tyr Arg Lys Pro Ile 
195 200 205 

Thr Tyr Asp Thr Leu Trp Gln Ala Asp Thr Asp Pro Leu Pro Val Val 
210 215 220 

Phe Pro Ile Val Ala Arg 
225 230 

(2) INFORMATION FOR SEQ ID NO: 67: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 132 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 67: 

Thr Ala Ala Ser Asp Asn Phe Gln Leu Ser Gln Gly Gly Gln Gly Phe 
1 5 10 15 

Ala Ile Pro Ile Gly Gln Ala Met Ala Ile Ala Gly Gln Ile Arg Ser 
20 25 30 

Gly Gly Gly Ser Pro Thr Val His Ile Gly Pro Thr Ala Phe Leu Gly 
35 40 45 

Leu Gly Val Val Asp Asn Asn Gly Asn Gly Ala Arg Val Gln Arg Val 
50 55 60 

Val Gly Ser Ala Pro Ala Ala Ser Leu Gly Ile Ser Thr Gly Asp Val 
65 70 75 80 

Ile Thr Ala Val Asp Gly Ala Pro Ile Asn Ser Ala Thr Ala Met Ala 
85 90 95 

Asp Ala Leu Asn Gly His His Pro Gly Asp Val Ile Ser Val Asn Trp 
100 105 110 

Gln Thr Lys Ser Gly Gly Thr Arg Thr Gly Asn Val Thr Leu Ala Glu 
115 120 125 

Gly Pro Pro Ala 
130 

(2) INFORMATION FOR SEQ ID NO: 68: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 68: 

Val Pro Leu Arg Ser Pro Ser Met Ser Pro Ser Lys Cys Leu Ala Ala 
1 5 10 15 

Ala Gln Arg Asn Pro Val Ile Arg Arg Arg Arg Leu Ser Asn Pro Pro 
20 25 30 

Pro Arg Lys Tyr Arg Ser Met Pro Ser Pro Ala Thr Ala Ser Ala Gly 
35 40 45 

Met Ala Arg Val Arg Arg Arg Ala Ile Trp Arg Gly Pro Ala Thr Xaa 
50 55 60 

Ser Ala Gly Met Ala Arg Val Arg Arg Trp Xaa Val Met Pro Xaa Val 
65 70 75 80 

Ile Gln Ser Thr Xaa Ile Arg Xaa Xaa Gly Pro Phe Asp Asn Arg Gly 
85 90 95 

Ser Glu Arg Lys 
100 

(2) INFORMATION FOR SEQ ID NO: 69: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 163 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 69: 

Met Thr Asp Asp Ile Leu Leu Ile Asp Thr Asp Glu Arg Val Arg Thr 
1 5 10 15 

Leu Thr Leu Asn Arg Pro Gln Ser Arg Asn Ala Leu Ser Ala Ala Leu 
20 25 30 

Arg Asp Arg Phe Phe Ala Xaa Leu Xaa Asp Ala Glu Xaa Asp Asp Asp 
35 40 45 

Ile Asp Val Val Ile Leu Thr Gly Ala Asp Pro Val Phe Cys Ala Gly 
50 55 60 

Leu Asp Leu Lys Val Ala Gly Arg Ala Asp Arg Ala Ala Gly His Leu 
65 70 75 80 

Thr Ala Val Gly Gly His Asp Gln Ala Gly Asp Arg Arg Asp Gln Arg 
85 90 95 

Arg Arg Gly His Arg Arg Ala Arg Thr Gly Ala Val Leu Arg His Pro 
100 105 110 

Asp Arg Leu Arg Ala Arg Pro Leu Arg Arg His Pro Arg Pro Gly Gly 
115 120 125 

Ala Ala Ala His Leu Gly Thr Gln Cys Val Leu Ala Ala Lys Gly Arg 
130 135 140 

His Arg Xaa Gly Pro Val Asp Glu Pro Asp Arg Arg Leu Pro Val Arg 
145 150 155 160 

Asp Arg Arg 

(2) INFORMATION FOR SEQ ID NO: 70: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 344 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 70: 

Met Lys Phe Val Asn His Ile Glu Pro Val Ala Pro Arg Arg Ala Gly 
1 5 10 15 

Gly Ala Val Ala Glu Val Tyr Ala Glu Ala Arg Arg Glu Phe Gly Arg 
20 25 30 

Leu Pro Glu Pro Leu Ala Met Leu Ser Pro Asp Glu Gly Leu Leu Thr 
35 40 45 

Ala Gly Trp Ala Thr Leu Arg Glu Thr Leu Leu Val Gly Gln Val Pro 
50 55 60 

Arg Gly Arg Lys Glu Ala Val Ala Ala Ala Val Ala Ala Ser Leu Arg 
65 70 75 80 

Cys Pro Trp Cys Val Asp Ala His Thr Thr Met Leu Tyr Ala Ala Gly 
85 90 95 

Gln Thr Asp Thr Ala Ala Ala Ile Leu Ala Gly Thr Ala Pro Ala Ala 
100 105 110 

Gly Asp Pro Asn Ala Pro Tyr Val Ala Trp Ala Ala Gly Thr Gly Thr 
115 120 125 

Pro Ala Gly Pro Pro Ala Pro Phe Gly Pro Asp Val Ala Ala Glu Tyr 
130 135 140 

Leu Gly Thr Ala Val Gln Phe His Phe Ile Ala Arg Leu Val Leu Val 
145 150 155 160 

Leu Leu Asp Glu Thr Phe Leu Pro Gly Gly Pro Arg Ala Gln Gln Leu 
165 170 175 

Met Arg Arg Ala Gly Gly Leu Val Phe Ala Arg Lys Val Arg Ala Glu 
180 185 190 

His Arg Pro Gly Arg Ser Thr Arg Arg Leu Glu Pro Arg Thr Leu Pro 
195 200 205 

Asp Asp Leu Ala Trp Ala Thr Pro Ser Glu Pro Ile Ala Thr Ala Phe 
210 215 220 

Ala Ala Leu Ser His His Leu Asp Thr Ala Pro His Leu Pro Pro Pro 
225 230 235 240 

Thr Arg Gln Val Val Arg Arg Val Val Gly Ser Trp His Gly Glu Pro 
245 250 255 

Met Pro Met Ser Ser Arg Trp Thr Asn Glu His Thr Ala Glu Leu Pro 
260 265 270 

Ala Asp Leu His Ala Pro Thr Arg Leu Ala Leu Leu Thr Gly Leu Ala 
275 280 285 

Pro His Gln Val Thr Asp Asp Asp Val Ala Ala Ala Arg Ser Leu Leu 
290 295 300 

Asp Thr Asp Ala Ala Leu Val Gly Ala Leu Ala Trp Ala Ala Phe Thr 
305 310 315 320 

Ala Ala Arg Arg Ile Gly Thr Trp Ile Gly Ala Ala Ala Glu Gly Gln 
325 330 335 

Val Ser Arg Gln Asn Pro Thr Gly 
340 

(2) INFORMATION FOR SEQ ID NO: 71: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 485 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 71: 

Asp Asp Pro Asp Met Pro Gly Thr Val Ala Lys Ala Val Ala Asp Ala 
1 5 10 15 

Leu Gly Arg Gly He Ala Pro Val Glu Asp He Gin Asp Cys Val Glu 
20 25 30 

Ala Arg Leu Gly Glu Ala Gly Leu Asp Asp Val Ala Arg Val Tyr He 
35 4 0 4 5 

lie Tyr Arg Gin Arg Arg Ala Glu Leu Arg Thr Ala Lys Ala Leu Leu 
50 55 60 

Gly Val Arg Asp Glu Leu Lys Leu Ser Leu Ala Ala Val Thr Val Leu 
65 70 75 80 
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Arg Glu Arg Tyr Leu Leu His Asp Glu Gin Gly Arg Pro Ala Glu Ser 
85 90 95 

Thr Gly Glu Leu Met Asp Arg Ser Ala Arg Cys Val Ala Ala Ala Glu 
100 105 HO 

Asp Gin Tyr Glu Pro Gly Ser Ser Arg Arg Trp Ala Glu Arg Phe Ala 
115 120 125 

Thr Leu Leu Arg Asn Leu Glu Phe Leu Pro Asn Ser Pro Thr Leu Met 
130 135 140 

Asn Ser Gly Thr Asp Leu Gly Leu Leu Ala Gly Cys Phe Val Leu Pro 
145 " 150 155 160 

He Glu Asp Ser Leu Gin Ser He Phe Ala Thr Leu Gly Gin Ala Ala 
165 170 175 

Glu Leu Gin Arg Ala Gly Gly Gly Thr Gly Tyr Ala Phe Ser His Leu 
180 185 190 

Arg Pro Ala Gly Asp Arg Val Ala Ser Thr Gly Gly Thr Ala Ser Gly 
195 200 205 

Pro Val Ser Phe Leu Arg Leu Tyr Asp Ser Ala Ala Gly Val Val Ser 
210 215 220 

Met Gly Gly Arg Arg Arg Gly Ala Cys Met Ala Val Leu Asp Val Ser 
225 230 235 240 

His Pro Asp He Cys Asp Phe Val Thr Ala Lys Ala Glu Ser Pro Ser 
245 250 255 

Glu Leu Pro His Phe Asn Leu Ser Val Gly Val Thr Asp Ala Phe Leu 
260 265 270 

Arg Ala Val Glu Arg Asn Gly Leu His Arg Leu Val Asn Pro Arg Thr 
275 280 285 

Gly Lys He Val Ala Arg Met Pro Ala Ala Glu Leu Phe Asp Ala He 
290 295 300 

Cys Lys Ala Ala His Ala Gly Gly Asp Pro Gly Leu Val Phe Leu Asp 
305 310 315 320 

Thr He Asn Arg Ala Asn Pro Val Pro Gly Arg Gly Arg He Glu Ala 
325 330 335 

Thr Asn Pro Cys Gly Glu Val Pro Leu Leu Pro Tyr Glu Ser Cys Asn 
340 345 350 

Leu Gly Ser He Asn Leu Ala Arg Met Leu Ala Asp Gly Arg Val Asp 
355 360 365 

Trp Asp Arg Leu Glu Glu Val Ala Gly Val Ala Val Arg Phe Leu Asp 



BNSDOCID: <WO 9816645A2J_> 



WO 98/16645 



PCT7US97/18214 



106 



370 375 380 

Asp Val Ile Asp Val Ser Arg Tyr Pro Phe Pro Glu Leu Gly Glu Ala 
385 390 395 400 

Ala Arg Ala Thr Arg Lys Ile Gly Leu Gly Val Met Gly Leu Ala Glu 
405 410 415 

Leu Leu Ala Ala Leu Gly Ile Pro Tyr Asp Ser Glu Glu Ala Val Arg 
420 425 430 

Leu Ala Thr Arg Leu Met Arg Arg Ile Gln Gln Ala Ala His Thr Ala 
435 440 445 

Ser Arg Arg Leu Ala Glu Glu Arg Gly Ala Phe Pro Ala Phe Thr Asp 
450 455 460 

Ser Arg Phe Ala Arg Ser Gly Pro Arg Arg Asn Ala Gln Val Thr Ser 
465 470 475 480 

Val Ala Pro Thr Gly 
485 

(2) INFORMATION FOR SEQ ID NO: 72: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 267 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 72: 

Gly Val Ile Val Leu Asp Leu Glu Pro Arg Gly Pro Leu Pro Thr Glu 
1 5 10 15 

Ile Tyr Trp Arg Arg Arg Gly Leu Ala Leu Gly Ile Ala Val Val Val 
20 25 30 

Val Gly Ile Ala Val Ala Ile Val Ile Ala Phe Val Asp Ser Ser Ala 
35 40 45 

Gly Ala Lys Pro Val Ser Ala Asp Lys Pro Ala Ser Ala Gln Ser His 
50 55 60 

Pro Gly Ser Pro Ala Pro Gln Ala Pro Gln Pro Ala Gly Gln Thr Glu 
65 70 75 80 

Gly Asn Ala Ala Ala Ala Pro Pro Gln Gly Gln Asn Pro Glu Thr Pro 
85 90 95 

Thr Pro Thr Ala Ala Val Gln Pro Pro Pro Val Leu Lys Glu Gly Asp 
100 105 110 

Asp Cys Pro Asp Ser Thr Leu Ala Val Lys Gly Leu Thr Asn Ala Pro 
115 120 125 

Gln Tyr Tyr Val Gly Asp Gln Pro Lys Phe Thr Met Val Val Thr Asn 
130 135 140 

Ile Gly Leu Val Ser Cys Lys Arg Asp Val Gly Ala Ala Val Leu Ala 
145 150 155 160 

Ala Tyr Val Tyr Ser Leu Asp Asn Lys Arg Leu Trp Ser Asn Leu Asp 
165 170 175 

Cys Ala Pro Ser Asn Glu Thr Leu Val Lys Thr Phe Ser Pro Gly Glu 
180 185 190 

Gln Val Thr Thr Ala Val Thr Trp Thr Gly Met Gly Ser Ala Pro Arg 
195 200 205 

Cys Pro Leu Pro Arg Pro Ala Ile Gly Pro Gly Thr Tyr Asn Leu Val 
210 215 220 

Val Gln Leu Gly Asn Leu Arg Ser Leu Pro Val Pro Phe Ile Leu Asn 
225 230 235 240 

Gln Pro Pro Pro Pro Pro Gly Pro Val Pro Ala Pro Gly Pro Ala Gln 
245 250 255 

Ala Pro Pro Pro Glu Ser Pro Ala Gln Gly Gly 
260 265 

(2) INFORMATION FOR SEQ ID NO: 73: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 97 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 73: 

Leu Ile Ser Thr Gly Lys Ala Ser His Ala Ser Leu Gly Val Gln Val 
1 5 10 15 

Thr Asn Asp Lys Asp Thr Pro Gly Ala Lys Ile Val Glu Val Val Ala 
20 25 30 

Gly Gly Ala Ala Ala Asn Ala Gly Val Pro Lys Gly Val Val Val Thr 
35 40 45 

Lys Val Asp Asp Arg Pro Ile Asn Ser Ala Asp Ala Leu Val Ala Ala 
50 55 60 

Val Arg Ser Lys Ala Pro Gly Ala Thr Val Ala Leu Thr Phe Gln Asp 
65 70 75 80 

Pro Ser Gly Gly Ser Arg Thr Val Gln Val Thr Leu Gly Lys Ala Glu 
85 90 95 

Gln 



(2) INFORMATION FOR SEQ ID NO: 74: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 364 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 74: 

Gly Ala Ala Val Ser Leu Leu Ala Ala Gly Thr Leu Val Leu Thr Ala 
1 5 10 15 

Cys Gly Gly Gly Thr Asn Ser Ser Ser Ser Gly Ala Gly Gly Thr Ser 
20 25 30 

Gly Ser Val His Cys Gly Gly Lys Lys Glu Leu His Ser Ser Gly Ser 
35 40 45 

Thr Ala Gln Glu Asn Ala Met Glu Gln Phe Val Tyr Ala Tyr Val Arg 
50 55 60 

Ser Cys Pro Gly Tyr Thr Leu Asp Tyr Asn Ala Asn Gly Ser Gly Ala 
65 70 75 80 

Gly Val Thr Gln Phe Leu Asn Asn Glu Thr Asp Phe Ala Gly Ser Asp 
85 90 95 

Val Pro Leu Asn Pro Ser Thr Gly Gln Pro Asp Arg Ser Ala Glu Arg 
100 105 110 

Cys Gly Ser Pro Ala Trp Asp Leu Pro Thr Val Phe Gly Pro Ile Ala 
115 120 125 

Ile Thr Tyr Asn Ile Lys Gly Val Ser Thr Leu Asn Leu Asp Gly Pro 
130 135 140 

Thr Thr Ala Lys Ile Phe Asn Gly Thr Ile Thr Val Trp Asn Asp Pro 
145 150 155 160 



Gln Ile Gln Ala Leu Asn Ser Gly Thr Asp Leu Pro Pro Thr Pro Ile 
165 170 175 

Ser Val Ile Phe Arg Ser Asp Lys Ser Gly Thr Ser Asp Asn Phe Gln 
180 185 190 

Lys Tyr Leu Asp Gly Val Ser Asn Gly Ala Trp Gly Lys Gly Ala Ser 
195 200 205 

Glu Thr Phe Ser Gly Gly Val Gly Val Gly Ala Ser Gly Asn Asn Gly 
210 215 220 

Thr Ser Ala Leu Leu Gln Thr Thr Asp Gly Ser Ile Thr Tyr Asn Glu 
225 230 235 240 

Trp Ser Phe Ala Val Gly Lys Gln Leu Asn Met Ala Gln Ile Ile Thr 
245 250 255 

Ser Ala Gly Pro Asp Pro Val Ala Ile Thr Thr Glu Ser Val Gly Lys 
260 265 270 

Thr Ile Ala Gly Ala Lys Ile Met Gly Gln Gly Asn Asp Leu Val Leu 
275 280 285 

Asp Thr Ser Ser Phe Tyr Arg Pro Thr Gln Pro Gly Ser Tyr Pro Ile 
290 295 300 

Val Leu Ala Thr Tyr Glu Ile Val Cys Ser Lys Tyr Pro Asp Ala Thr 
305 310 315 320 

Thr Gly Thr Ala Val Arg Ala Phe Met Gln Ala Ala Ile Gly Pro Gly 
325 330 335 

Gln Glu Gly Leu Asp Gln Tyr Gly Ser Ile Pro Leu Pro Lys Ser Phe 
340 345 350 

Gln Ala Lys Leu Ala Ala Ala Val Asn Ala Ile Ser 
355 360 

(2) INFORMATION FOR SEQ ID NO: 75: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 309 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 75: 

Gln Ala Ala Ala Gly Arg Ala Val Arg Arg Thr Gly His Ala Glu Asp 
1 5 10 15 

Gln Thr His Gln Asp Arg Leu His His Gly Cys Arg Arg Ala Ala Val 
20 25 30 

Val Val Arg Gin Asp Arg Ala Ser Val Ser Ala Thr Ser Ala Arg Pro 
35 40 45 

Pro Arg Arg His Pro Ala Gin Gly His Arg Arg Arg Val Ala Pro Ser 
50 55 60 

Gly Gly Arg Arg Arg Pro His Pro His His Val Gin Pro Asp Asp Arg 
65 70 75 80 

Arg Asp Arg Pro Ala Leu Leu Asp Arg Thr Gin Pro Ala Glu His Pro 
85 90 95 

Asp Pro His Arg Arg Gly Pro Ala Asp Pro Gly Arg Val Arg Gly Arg 
100 105 110 

Gly Arg Leu Arg Arg Val Asp Asp Gly Arg Leu Gln Pro Asp Arg Asp 
115 120 125 

Ala Asp His Gly Ala Pro Val Arg Gly Arg Gly Pro His Arg Gly Val 
130 135 140 

Gin His Arg Gly Gly Pro Val Phe Val Arg Arg Val Pro Gly Val Arg 
145 150 155 160 

Cys Ala His Arg Arg Gly His Arg Arg Val Ala Ala Pro Gly Gin Gly 
165 170 175 

Asp Val Leu Arg Ala Gly Leu Arg Val Glu Arg Leu Arg Pro Val Ala 
180 185 190 

Ala Val Glu Asn Leu His Arg Gly Ser Gin Arg Ala Asp Gly Arg Val 
195 200 205 

Phe Arg Pro He Arg Arg Gly Ala Arg Leu Pro Ala Arg Arg Ser Arg 
210 215 220 

Ala Gly Pro Gin Gly Arg Leu His Leu Asp Gly Ala Gly Pro Ser Pro 
225 230 235 240 

Leu Pro Ala Arg Ala Gly Gin Gin Gin Pro Ser Ser Ala Gly Gly Arg 
245 250 255 

Arg Ala Gly Gly Ala Glu Arg Ala Asp Pro Gly Gin Arg Gly Arg His 
260 265 270 

His Gin Gly Gly His Asp Pro Gly Arg Gin Gly Ala Gin Arg Gly Thr 
275 280 285 

Ala Gly Val Ala His Ala Ala Ala Gly Pro Arg Arg Ala Ala Val Arg 
290 295 300 

Asn Arg Pro Arg Arg 
305 



(2) INFORMATION FOR SEQ ID NO: 76: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 580 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 76: 

Ser Ala Val Trp Cys Leu Asn Gly Phe Thr Gly Arg His Arg His Gly 
1 5 10 15 

Arg Cys Arg Val Arg Ala Ser Gly Trp Arg Ser Ser Asn Arg Trp Cys 
20 25 30 

Ser Thr Thr Ala Asp Cys Cys Ala Ser Lys Thr Pro Thr Gin Ala Ala 
35 40 45 

Ser Pro Leu Glu Arg Arg Phe Thr Cys Cys Ser Pro Ala Val Gly Cys 
50 55 60 

Arg Phe Arg Ser Phe Pro Val Arg Arg Leu Ala Leu Gly Ala Arg Thr 
65 70 75 80 

Ser Arg Thr Leu Gly Val Arg Arg Thr Leu Ser Gln Trp Asn Leu Ser 
85 90 95 

Pro Arg Ala Gln Pro Ser Cys Ala Val Thr Val Glu Ser His Thr His 
100 105 110 

Ala Ser Pro Arg Met Ala Lys Leu Ala Arg Val Val Gly Leu Val Gin 
115 120 125 

Glu Glu Gin Pro Ser Asp Met Thr Asn His Pro Arg Tyr Ser Pro Pro 
130 135 140 

Pro Gin Gin Pro Gly Thr Pro Gly Tyr Ala Gin Gly Gin Gin Gin Thr 
145 150 155 160 

Tyr Ser Gin Gin Phe Asp Trp Arg Tyr Pro Pro Ser Pro Pro Pro Gin 
165 170 175 

Pro Thr Gin Tyr Arg Gin Pro Tyr Glu Ala Leu Gly Gly Thr Arg Pro 
180 185 190 

Gly Leu Ile Pro Gly Val Ile Pro Thr Met Thr Pro Pro Pro Gly Met 
195 200 205 

Val Arg Gin Arg Pro Arg Ala Gly Met Leu Ala He Gly Ala Val Thr 
210 215 220 

He Ala Val Val Ser Ala Gly He Gly Gly Ala Ala Ala Ser Leu Val 
225 230 235 240 

Gly Phe Asn Arg Ala Pro Ala Gly Pro Ser Gly Gly Pro Val Ala Ala 
245 250 255 

Ser Ala Ala Pro Ser He Pro Ala Ala Asn Met Pro Pro Gly Ser Val 
260 265 270 

Glu Gin Val Ala Ala Lys Val Val Pro Ser Val Val Met Leu Glu Thr 
275 280 285 

Asp Leu Gly Arg Gin Ser Glu Glu Gly Ser Gly He He Leu Ser Ala 
290 295 300 

Glu Gly Leu He Leu Thr Asn Asn His Val He Ala Ala Ala Ala Lys 
305 310 315 320 

Pro Pro Leu Gly Ser Pro Pro Pro Lys Thr Thr Val Thr Phe Ser Asp 
325 330 335 

Gly Arg Thr Ala Pro Phe Thr Val Val Gly Ala Asp Pro Thr Ser Asp 
340 345 350 

He Ala Val Val Arg Val Gin Gly Val Ser Gly Leu Thr Pro He Ser 
355 360 365 

Leu Gly Ser Ser Ser Asp Leu Arg Val Gly Gin Pro Val Leu Ala He 
370 375 380 

Gly Ser Pro Leu Gly Leu Glu Gly Thr Val Thr Thr Gly He Val Ser 
385 390 395 400 

Ala Leu Asn Arg Pro Val Ser Thr Thr Gly Glu Ala Gly Asn Gin Asn 
405 410 415 

Thr Val Leu Asp Ala He Gin Thr Asp Ala Ala He Asn Pro Gly Asn 
420 425 430 

Ser Gly Gly Ala Leu Val Asn Met Asn Ala Gin Leu Val Gly Val Asn 
435 440 445 

Ser Ala He Ala Thr Leu Gly Ala Asp Ser Ala Asp Ala Gin Ser Gly 
450 455 460 

Ser He Gly Leu Gly Phe Ala He Pro Val Asp Gin Ala Lys Arg He 
465 470 475 480 

Ala Asp Glu Leu Ile Ser Thr Gly Lys Ala Ser His Ala Ser Leu Gly 
485 490 495 

Val Gln Val Thr Asn Asp Lys Asp Thr Pro Gly Ala Lys Ile Val Glu 
500 505 510 

Val Val Ala Gly Gly Ala Ala Ala Asn Ala Gly Val Pro Lys Gly Val 
515 520 525 

Val Val Thr Lys Val Asp Asp Arg Pro Ile Asn Ser Ala Asp Ala Leu 
530 535 540 

Val Ala Ala Val Arg Ser Lys Ala Pro Gly Ala Thr Val Ala Leu Thr 
545 550 555 560 

Phe Gln Asp Pro Ser Gly Gly Ser Arg Thr Val Gln Val Thr Leu Gly 
565 570 575 

Lys Ala Glu Gln 
580 



(2) INFORMATION FOR SEQ ID NO: 77: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 233 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 77: 

Met Asn Asp Gly Lys Arg Ala Val Thr Ser Ala Val Leu Val Val Leu 
1 5 10 15 

Gly Ala Cys Leu Ala Leu Trp Leu Ser Gly Cys Ser Ser Pro Lys Pro 
20 25 30 

Asp Ala Glu Glu Gin Gly Val Pro Val Ser Pro Thr Ala Ser Asp Pro 
35 40 45 

Ala Leu Leu Ala Glu He Arg Gin Ser Leu Asp Ala Thr Lys Gly Leu 
50 55 60 

Thr Ser Val His Val Ala Val Arg Thr Thr Gly Lys Val Asp Ser Leu 
65 70 75 80 

Leu Gly He Thr Ser Ala Asp Val Asp Val Arg Ala Asn Pro Leu Ala 
85 90 95 

Ala Lys Gly Val Cys Thr Tyr Asn Asp Glu Gin Gly Val Pro Phe Arg 
100 105 110 

Val Gln Gly Asp Asn Ile Ser Val Lys Leu Phe Asp Asp Trp Ser Asn 
115 120 125 

Leu Gly Ser He Ser Glu Leu Ser Thr Ser Arg Val Leu Asp Pro Ala 
130 135 140 

Ala Gly Val Thr Gin Leu Leu Ser Gly Val Thr Asn Leu Gin Ala Gin 
145 150 155 160 

Gly Thr Glu Val He Asp Gly He Ser Thr Thr Lys He Thr Gly Thr 
165 170 175 

He Pro Ala Ser Ser Val Lys Met Leu Asp Pro Gly Ala Lys Ser Ala 
180 185 190 

Arg Pro Ala Thr Val Trp He Ala Gin Asp Gly Ser His His Leu Val 
195 200 205 

Arg Ala Ser Ile Asp Leu Gly Ser Gly Ser Ile Gln Leu Thr Gln Ser 
210 215 220 

Lys Trp Asn Glu Pro Val Asn Val Asp 
225 230 

(2) INFORMATION FOR SEQ ID NO: 78: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 66 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 78: 

Val Ile Asp Ile Ile Gly Thr Ser Pro Thr Ser Trp Glu Gln Ala Ala 
1 5 10 15 

Ala Glu Ala Val Gln Arg Ala Arg Asp Ser Val Asp Asp Ile Arg Val 
20 25 30 

Ala Arg Val Ile Glu Gln Asp Met Ala Val Asp Ser Ala Gly Lys Ile 
35 40 45 

Thr Tyr Arg Ile Lys Leu Glu Val Ser Phe Lys Met Arg Pro Ala Gln 
50 55 60 

Pro Arg 
65 

(2) INFORMATION FOR SEQ ID NO: 79: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 69 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 79: 

Val Pro Pro Ala Pro Pro Leu Pro Pro Leu Pro Pro Ser Pro Ile Ser 
1 5 10 15 

Cys Ala Ser Pro Pro Ser Pro Pro Leu Pro Pro Ala Pro Pro Val Ala 
20 25 30 

Pro Gly Pro Pro Met Pro Pro Leu Asp Pro Trp Pro Pro Ala Pro Pro 
35 40 45 

Leu Pro Tyr Ser Thr Pro Pro Gly Ala Pro Leu Pro Pro Ser Pro Pro 
50 55 60 

Ser Pro Pro Leu Pro 
65 

(2) INFORMATION FOR SEQ ID NO: 80: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 355 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 80: 

Met Ser Asn Ser Arg Arg Arg Ser Leu Arg Trp Ser Trp Leu Leu Ser 
1 5 10 15 

Val Leu Ala Ala Val Gly Leu Gly Leu Ala Thr Ala Pro Ala Gin Ala 
20 25 30 

Ala Pro Pro Ala Leu Ser Gin Asp Arg Phe Ala Asp Phe Pro Ala Leu 
35 40 45 

Pro Leu Asp Pro Ser Ala Met Val Ala Gin Val Ala Pro Gin Val Val 
50 55 60 

Asn He Asn Thr Lys Leu Gly Tyr Asn Asn Ala Val Gly Ala Gly Thr 
65 70 75 80 

Gly Ile Val Ile Asp Pro Asn Gly Val Val Leu Thr Asn Asn His Val 
85 90 95 

Ile Ala Gly Ala Thr Asp Ile Asn Ala Phe Ser Val Gly Ser Gly Gln 
100 105 110 

Thr Tyr Gly Val Asp Val Val Gly Tyr Asp Arg Thr Gin Asp Val Ala 
115 120 125 

Val Leu Gin Leu Arg Gly Ala Gly Gly Leu Pro Ser Ala Ala He Gly 
130 135 140 

Gly Gly Val Ala Val Gly Glu Pro Val Val Ala Met Gly Asn Ser Gly 
145 150 155 160 

Gly Gin Gly Gly Thr Pro Arg Ala Val Pro Gly Arg Val Val Ala Leu 
165 170 175 

Gly Gin Thr Val Gin Ala Ser Asp Ser Leu Thr Gly Ala Glu Glu Thr 
180 185 190 

Leu Asn Gly Leu He Gin Phe Asp Ala Ala He Gin Pro Gly Asp Ser 
195 200 205 

Gly Gly Pro Val Val Asn Gly Leu Gly Gin Val Val Gly Met Asn Thr 
210 215 220 

Ala Ala Ser Asp Asn Phe Gin Leu Ser Gin Gly Gly Gin Gly Phe Ala 
225 230 235 240 

He Pro He Gly Gin Ala Met Ala He Ala Gly Gin He Arg Ser Gly 
245 250 255 

Gly Gly Ser Pro Thr Val His He Gly Pro Thr Ala Phe Leu Gly Leu 
260 265 270 

Gly Val Val Asp Asn Asn Gly Asn Gly Ala Arg Val Gin Arg Val Val 
275 280 285 

Gly Ser Ala Pro Ala Ala Ser Leu Gly He Ser Thr Gly Asp Val He 
290 295 300 

Thr Ala Val Asp Gly Ala Pro He Asn Ser Ala Thr Ala Met Ala Asp 
305 310 315 320 

Ala Leu Asn Gly His His Pro Gly Asp Val He Ser Val Asn Trp Gin 
325 330 335 

Thr Lys Ser Gly Gly Thr Arg Thr Gly Asn Val Thr Leu Ala Glu Gly 
340 345 350 

Pro Pro Ala 
355 

(2) INFORMATION FOR SEQ ID NO: 81: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 205 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 81: 

Ser Pro Lys Pro Asp Ala Glu Glu Gin Gly Val Pro Val Ser Pro Thr 
1 5 10 15 

Ala Ser Asp Pro Ala Leu Leu Ala Glu Ile Arg Gln Ser Leu Asp Ala 
20 25 30 

Thr Lys Gly Leu Thr Ser Val His Val Ala Val Arg Thr Thr Gly Lys 
35 40 45 

Val Asp Ser Leu Leu Gly He Thr Ser Ala Asp Val Asp Val Arg Ala 
50 55 60 

Asn Pro Leu Ala Ala Lys Gly Val Cys Thr Tyr Asn Asp Glu Gin Gly 
65 70 75 80 

Val Pro Phe Arg Val Gin Gly Asp Asn He Ser Val Lys Leu Phe Asp 
85 90 95 

Asp Trp Ser Asn Leu Gly Ser He Ser Glu Leu Ser Thr Ser Arg Val 
100 105 110 

Leu Asp Pro Ala Ala Gly Val Thr Gin Leu Leu Ser Gly Val Thr Asn 
115 120 125 

Leu Gin Ala Gin Gly Thr Glu Val He Asp Gly He Ser Thr Thr Lys 
130 135 140 

He Thr Gly Thr He Pro Ala Ser Ser Val Lys Met Leu Asp Pro Gly 
145 150 155 160 

Ala Lys Ser Ala Arg Pro Ala Thr Val Trp He Ala Gin Asp Gly Ser 
165 170 175 

His His Leu Val Arg Ala Ser He Asp Leu Gly Ser Gly Ser He Gin 
180 185 190 

Leu Thr Gin Ser Lys Trp Asn Glu Pro Val Asn Val Asp 
195 200 205 



(2) INFORMATION FOR SEQ ID NO: 82: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 286 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 82: 

Gly Asp Ser Phe Trp Ala Ala Ala Asp Gin Met Ala Arg Gly Phe Val 
1 5 10 15 

Leu Gly Ala Thr Ala Gly Arg Thr Thr Leu Thr Gly Glu Gly Leu Gin 
20 25 30 

His Ala Asp Gly His Ser Leu Leu Leu Asp Ala Thr Asn Pro Ala Val 
35 40 45 

Val Ala Tyr Asp Pro Ala Phe Ala Tyr Glu lie Gly Tyr lie Xaa Glu 
50 55 60 

Ser Gly Leu Ala Arg Met Cys Gly Glu Asn Pro Glu Asn lie Phe Phe 
65 70 75 80 

Tyr lie Thr Val Tyr Asn Glu Pro Tyr Val Gin Pro Pro Glu Pro Glu 
85 90 95 

Asn Phe Asp Pro Glu Gly Val Leu Gly Gly lie Tyr Arg Tyr His Ala 
100 105 110 

Ala Thr Glu Gin Arg Thr Asn Lys Xaa Gin lie Leu Ala Ser Gly Val 
115 120 125 

Ala Met Pro Ala Ala Leu Arg Ala Ala Gin Met Leu Ala Ala Glu Trp 
130 135 140 

Asp Val Ala Ala Asp Val Trp Ser Val Thr Ser Trp Gly Glu Leu Asn 
145 150 155 160 

Arg Asp Gly Val Val lie Glu Thr Glu Lys Leu Arg His Pro Asp Arg 
165 170 175 

Pro Ala Gly Val Pro Tyr Val Thr Arg Ala Leu Glu Asn Ala Arg Gly 
180 185 190 

Pro Val lie Ala Val Ser Asp Trp Met Arg Ala Val Pro Glu Gin lie 
195 200 205 

Arg Pro Trp Val Pro Gly Thr Tyr Leu Thr Leu Gly Thr Asp Gly Phe 
210 215 220 



Gly Phe Ser Asp Thr Arg Pro Ala Gly Arg Arg Tyr Phe Asn Thr Asp 
225 230 235 240 

Ala Glu Ser Gin Val Gly Arg Gly Phe Gly Arg Gly Trp Pro Gly Arg 
245 250 255 

Arg Val Asn He Asp Pro Phe Gly Ala Gly Arg Gly Pro Pro Ala Gin 
260 265 270 

Leu Pro Gly Phe Asp Glu Gly Gly Gly Leu Arg Pro Xaa Lys 
275 280 285 

(2) INFORMATION FOR SEQ ID NO: 83: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 173 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 83: 

Thr Lys Phe His Ala Leu Met Gin Glu Gin He His Asn Glu Phe Thr 
1 5 10 15 

Ala Ala Gin Gin Tyr Val Ala lie Ala Val Tyr Phe Asp Ser Glu Asp 
20 25 30 

Leu Pro Gin Leu Ala Lys His Phe Tyr Ser Gin Ala Val Glu Glu Arg 
35 40 45 

Asn His Ala Met Met Leu Val Gin His Leu Leu Asp Arg Asp Leu Arg 
50 55 60 

Val Glu He Pro Gly Val Asp Thr Val Arg Asn Gin Phe Asp Arg Pro 
65 70 75 80 

Arg Glu Ala Leu Ala Leu Ala Leu Asp Gin Glu Arg Thr Val Thr Asp 
85 90 95 

Gin Val Gly Arg Leu Thr Ala Val Ala Arg Asp Glu Gly Asp Phe Leu 
100 105 110 

Gly Glu Gln Phe Met Gln Trp Phe Leu Gln Glu Gln Ile Glu Glu Val 
115 120 125 

Ala Leu Met Ala Thr Leu Val Arg Val Ala Asp Arg Ala Gly Ala Asn 
130 135 140 

Leu Phe Glu Leu Glu Asn Phe Val Ala Arg Glu Val Asp Val Ala Pro 
145 150 155 160 

Ala Ala Ser Gly Ala Pro His Ala Ala Gly Gly Arg Leu 
165 170 

(2) INFORMATION FOR SEQ ID NO: 84: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 107 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 84: 

Arg Ala Asp Glu Arg Lys Asn Thr Thr Met Lys Met Val Lys Ser Ile 
1 5 10 15 

Ala Ala Gly Leu Thr Ala Ala Ala Ala Ile Gly Ala Ala Ala Ala Gly 
20 25 30 

Val Thr Ser Ile Met Ala Gly Gly Pro Val Val Tyr Gln Met Gln Pro 
35 40 45 

Val Val Phe Gly Ala Pro Leu Pro Leu Asp Pro Xaa Ser Ala Pro Xaa 
50 55 60 

Val Pro Thr Ala Ala Gln Trp Thr Xaa Leu Leu Asn Xaa Leu Xaa Asp 
65 70 75 80 

Pro Asn Val Ser Phe Xaa Asn Lys Gly Ser Leu Val Glu Gly Gly Ile 
85 90 95 

Gly Gly Xaa Glu Gly Xaa Xaa Arg Arg Xaa Gln 
100 105 

(2) INFORMATION FOR SEQ ID NO: 85: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 125 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 85: 

Val Leu Ser Val Pro Val Gly Asp Gly Phe Trp Xaa Arg Val Val Asn 
1 5 10 15 

Pro Leu Gly Gln Pro Ile Asp Gly Arg Gly Asp Val Asp Ser Asp Thr 




20 25 30 

Arg Arg Ala Leu Glu Leu Gln Ala Pro Ser Val Val Xaa Arg Gln Gly 
35 40 45 

Val Lys Glu Pro Leu Xaa Thr Gly Ile Lys Ala Ile Asp Ala Met Thr 
50 55 60 

Pro Ile Gly Arg Gly Gln Arg Gln Leu Ile Ile Gly Asp Arg Lys Thr 
65 70 75 80 

Gly Lys Asn Arg Arg Leu Cys Arg Thr Pro Ser Ser Asn Gln Arg Glu 
85 90 95 

Glu Leu Gly Val Arg Trp Ile Pro Arg Ser Arg Cys Ala Cys Val Tyr 
100 105 110 

Val Gly His Arg Ala Arg Arg Gly Thr Tyr His Arg Arg 
115 120 125 

(2) INFORMATION FOR SEQ ID NO: 86: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 117 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 86: 

Cys Asp Ala Val Met Gly Phe Leu Gly Gly Ala Gly Pro Leu Ala Val 
1 5 10 15 

Val Asp Gln Gln Leu Val Thr Arg Val Pro Gln Gly Trp Ser Phe Ala 
20 25 30 

Gln Ala Ala Ala Val Pro Val Val Phe Leu Thr Ala Trp Tyr Gly Leu 
35 40 45 

Ala Asp Leu Ala Glu Ile Lys Ala Gly Glu Ser Val Leu Ile His Ala 
50 55 60 

Gly Thr Gly Gly Val Gly Met Ala Ala Val Gin Leu Ala Arg Gin Trp 
65 70 75 80 

Gly Val Glu Val Phe Val Thr Ala Ser Arg Gly Lys Trp Asp Thr Leu 
85 90 95 

Arg Ala Xaa Xaa Phe Asp Asp Xaa Pro Tyr Arg Xaa Phe Pro His Xaa 
100 105 110 

Arg Ser Ser Xaa Gly 
115 

(2) INFORMATION FOR SEQ ID NO: 87: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 103 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 87: 

Met Tyr Arg Phe Ala Cys Arg Thr Leu Met Leu Ala Ala Cys Ile Leu 
1 5 10 15 

Ala Thr Gly Val Ala Gly Leu Gly Val Gly Ala Gln Ser Ala Ala Gln 
20 25 30 

Thr Ala Pro Val Pro Asp Tyr Tyr Trp Cys Pro Gly Gin Pro Phe Asp 
35 40 45 

Pro Ala Trp Gly Pro Asn Trp Asp Pro Tyr Thr Cys His Asp Asp Phe 
50 55 60 

His Arg Asp Ser Asp Gly Pro Asp His Ser Arg Asp Tyr Pro Gly Pro 
65 70 75 80 

Ile Leu Glu Gly Pro Val Leu Asp Asp Pro Gly Ala Ala Pro Pro Pro 
85 90 95 

Pro Ala Ala Gly Gly Gly Ala 
100 

(2) INFORMATION FOR SEQ ID NO: 88: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 88 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 88: 

Val Gln Cys Arg Val Trp Leu Glu Ile Gln Trp Arg Gly Met Leu Gly 
1 5 10 15 

Ala Asp Gln Ala Arg Ala Gly Gly Pro Ala Arg Ile Trp Arg Glu His 
20 25 30 

Ser Met Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala 
35 40 45 

Thr Lys Glu Gly Arg Gly Ile Val Met Arg Val Pro Leu Glu Gly Gly 
50 55 60 

Gly Arg Leu Val Val Glu Leu Thr Pro Asp Glu Ala Ala Ala Leu Gly 
65 70 75 80 

Asp Glu Leu Lys Gly Val Thr Ser 
85 

(2) INFORMATION FOR SEQ ID NO: 89: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 95 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 89: 

Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile 
1 5 10 15 

Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala Gly 
20 25 30 

Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln Ala 
35 40 45 

Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu Leu 
50 55 60 

Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg 
65 70 75 80 

Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe 
85 90 95 

(2) INFORMATION FOR SEQ ID NO: 90: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 166 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 90: 

Met Thr Gln Ser Gln Thr Val Thr Val Asp Gln Gln Glu Ile Leu Asn 
1 5 10 15 

Arg Ala Asn Glu Val Glu Ala Pro Met Ala Asp Pro Pro Thr Asp Val 
20 25 30 

Pro Ile Thr Pro Cys Glu Leu Thr Xaa Xaa Lys Asn Ala Ala Gln Gln 
35 40 45 

Xaa Val Leu Ser Ala Asp Asn Met Arg Glu Tyr Leu Ala Ala Gly Ala 
50 55 60 



Lys Glu Arg Gln Arg Leu Ala Thr Ser Leu Arg Asn Ala Ala Lys Xaa 
65 70 75 80 



Tyr Gly Glu Val Asp Glu Glu Ala Ala Thr Ala Leu Asp Asn Asp Gly 
85 90 95 

Glu Gly Thr Val Gin Ala Glu Ser Ala Gly Ala Val Gly Gly Asp Ser 
100 105 110 

Ser Ala Glu Leu Thr Asp Thr Pro Arg Val Ala Thr Ala Gly Glu Pro 
115 120 125 

Asn Phe Met Asp Leu Lys Glu Ala Ala Arg Lys Leu Glu Thr Gly Asp 
130 135 140 

Gln Gly Ala Ser Leu Ala His Xaa Gly Asp Gly Trp Asn Thr Xaa Thr 
145 150 155 160 

Leu Thr Leu Gln Gly Asp 
165 

(2) INFORMATION FOR SEQ ID NO: 91: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 91: 

Arg Ala Glu Arg Met 
1 5 

(2) INFORMATION FOR SEQ ID NO: 92: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 263 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 92: 

Val Ala Trp Met Ser Val Thr Ala Gly Gln Ala Glu Leu Thr Ala Ala 
1 5 10 15 

Gin Val Arg Val Ala Ala Ala Ala Tyr Glu Thr Ala Tyr Gly Leu Thr 
20 25 30 

Val Pro Pro Pro Val Ile Ala Glu Asn Arg Ala Glu Leu Met Ile Leu 
35 40 45 

Ile Ala Thr Asn Leu Leu Gly Gln Asn Thr Pro Ala Ile Ala Val Asn 
50 55 60 

Glu Ala Glu Tyr Gly Glu Met Trp Ala Gln Asp Ala Ala Ala Met Phe 
65 70 75 80 

Gly Tyr Ala Ala Ala Thr Ala Thr Ala Thr Ala Thr Leu Leu Pro Phe 
85 90 95 

Glu Glu Ala Pro Glu Met Thr Ser Ala Gly Gly Leu Leu Glu Gln Ala 
100 105 110 

Ala Ala Val Glu Glu Ala Ser Asp Thr Ala Ala Ala Asn Gin Leu Met 
115 120 125 

Asn Asn Val Pro Gin Ala Leu Lys Gin Leu Ala Gin Pro Thr Gin Gly 
130 135 140 

Thr Thr Pro Ser Ser Lys Leu Gly Gly Leu Trp Lys Thr Val Ser Pro 
145 150 155 160 

His Arg Ser Pro He Ser Asn Met Val Ser Met Ala Asn Asn His Met 
165 170 175 

Ser Met Thr Asn Ser Gly Val Ser Met Thr Asn Thr Leu Ser Ser Met 
180 185 190 

Leu Lys Gly Phe Ala Pro Ala Ala Ala Ala Gln Ala Val Gln Thr Ala 
195 200 205 

Ala Gln Asn Gly Val Arg Ala Met Ser Ser Leu Gly Ser Ser Leu Gly 
210 215 220 

Ser Ser Gly Leu Gly Gly Gly Val Ala Ala Asn Leu Gly Arg Ala Ala 
225 230 235 240 

Ser Val Arg Tyr Gly His Arg Asp Gly Gly Lys Tyr Ala Xaa Ser Gly 
245 250 255 

Arg Arg Asn Gly Gly Pro Ala 
260 

(2) INFORMATION FOR SEQ ID NO: 93: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 303 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 93: 

Met Thr Tyr Ser Pro Gly Asn Pro Gly Tyr Pro Gln Ala Gln Pro Ala 
1 5 10 15 

Gly Ser Tyr Gly Gly Val Thr Pro Ser Phe Ala His Ala Asp Glu Gly 
20 25 30 

Ala Ser Lys Leu Pro Met Tyr Leu Asn Ile Ala Val Ala Val Leu Gly 
35 40 45 

Leu Ala Ala Tyr Phe Ala Ser Phe Gly Pro Met Phe Thr Leu Ser Thr 
50 55 60 

Glu Leu Gly Gly Gly Asp Gly Ala Val Ser Gly Asp Thr Gly Leu Pro 
65 70 75 80 

Val Gly Val Ala Leu Leu Ala Ala Leu Leu Ala Gly Val Val Leu Val 
85 90 95 

Pro Lys Ala Lys Ser His Val Thr Val Val Ala Val Leu Gly Val Leu 
100 105 110 

Gly Val Phe Leu Met Val Ser Ala Thr Phe Asn Lys Pro Ser Ala Tyr 
115 120 125 

Ser Thr Gly Trp Ala Leu Trp Val Val Leu Ala Phe Ile Val Phe Gln 
130 135 140 

Ala Val Ala Ala Val Leu Ala Leu Leu Val Glu Thr Gly Ala Ile Thr 
145 150 155 160 

Ala Pro Ala Pro Arg Pro Lys Phe Asp Pro Tyr Gly Gln Tyr Gly Arg 
165 170 175 

Tyr Gly Gln Tyr Gly Gln Tyr Gly Val Gln Pro Gly Gly Tyr Tyr Gly 
180 185 190 

Gln Gln Gly Ala Gln Gln Ala Ala Gly Leu Gln Ser Pro Gly Pro Gln 
195 200 205 

Gln Ser Pro Gln Pro Pro Gly Tyr Gly Ser Gln Tyr Gly Gly Tyr Ser 
210 215 220 

Ser Ser Pro Ser Gln Ser Gly Ser Gly Tyr Thr Ala Gln Pro Pro Ala 
225 230 235 240 

Gln Pro Pro Ala Gln Ser Gly Ser Gln Gln Ser His Gln Gly Pro Ser 
245 250 255 

Thr Pro Pro Thr Gly Phe Pro Ser Phe Ser Pro Pro Pro Pro Val Ser 
260 265 270 

Ala Gly Thr Gly Ser Gln Ala Gly Ser Ala Pro Val Asn Tyr Ser Asn 
275 280 285 

Pro Ser Gly Gly Glu Gln Ser Ser Ser Pro Gly Gly Ala Pro Val 
290 295 300 

(2) INFORMATION FOR SEQ ID NO: 94: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 507 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 94: 
ATGAAGATGG TGAAATCGAT CGCCGCAGGT CTGACCGCCG CGGCTGCAAT CGGCGCCGCT 60 

GCGGCCGGTG TGACTTCGAT CATGGCTGGC GGCCCGGTCG TATACCAGAT GCAGCCGGTC 120 

GTCTTCGGCG CGCCACTGCC GTTGGACCCG GCATCCGCCC CTGACGTCCC GACCGCCGCC 180 

CAGTTGACCA GCCTGCTCAA CAGCCTCGCC GATCCCAACG TGTCGTTTGC GAACAAGGGC 240 

AGTCTGGTCG AGGGCGGCAT CGGGGGCACC GAGGCGCGCA TCGCCGACCA CAAGCTGAAG 300 

AAGGCCGCCG AGCACGGGGA TCTGCCGCTG TCGTTCAGCG TGACGAACAT CCAGCCGGCG 360 

GCCGCCGGTT CGGCCACCGC CGACGTTTCC GTCTCGGGTC CGAAGCTCTC GTCGCCGGTC 420 

ACGCAGAACG TCACGTTCGT GAATCAAGGC GGCTGGATGC TGTCACGCGC ATCGGCGATG 480 

GAGTTGCTGC AGGCCGCAGG GAACTGA 507 



(2) INFORMATION FOR SEQ ID NO: 95: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 168 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 95: 

Met Lys Met Val Lys Ser Ile Ala Ala Gly Leu Thr Ala Ala Ala Ala 
1 5 10 15 

Ile Gly Ala Ala Ala Ala Gly Val Thr Ser Ile Met Ala Gly Gly Pro 
20 25 30 

Val Val Tyr Gln Met Gln Pro Val Val Phe Gly Ala Pro Leu Pro Leu 
35 40 45 

Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Leu Thr Ser 
50 55 60 

Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala Asn Lys Gly 
65 70 75 80 

Ser Leu Val Glu Gly Gly Ile Gly Gly Thr Glu Ala Arg Ile Ala Asp 
85 90 95 

His Lys Leu Lys Lys Ala Ala Glu His Gly Asp Leu Pro Leu Ser Phe 
100 105 110 

Ser Val Thr Asn Ile Gln Pro Ala Ala Ala Gly Ser Ala Thr Ala Asp 
115 120 125 

Val Ser Val Ser Gly Pro Lys Leu Ser Ser Pro Val Thr Gln Asn Val 
130 135 140 

Thr Phe Val Asn Gln Gly Gly Trp Met Leu Ser Arg Ala Ser Ala Met 
145 150 155 160 

Glu Leu Leu Gln Ala Ala Gly Asn 
165 

(2) INFORMATION FOR SEQ ID NO: 96: 
(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 500 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 96: 

CGTGGCAATG TCGTTGACCG TCGGGGCCGG GGTCGCCTCC GCAGATCCCG TGGACGCGGT 60 

CATTAACACC ACCTGCAATT ACGGGCAGGT AGTAGCTGCG CTCAACGCGA CGGATCCGGG 120 

GGCTGCCGCA CAGTTCAACG CCTCACCGGT GGCGCAGTCC TATTTGCGCA ATTTCCTCGC 180 

CGCACCGCCA CCTCAGCGCG CTGCCATGGC CGCGCAATTG CAAGCTGTGC CGGGGGCGGC 240 

ACAGTACATC GGCCTTGTCG AGTCGGTTGC CGGCTCCTGC AACAACTATT AAGCCCATGC 300 

GGGCCCCATC CCGCGACCCG GCATCGTCGC CGGGGCTAGG CCAGATTGCC CCGCTCCTCA 360 

ACGGGCCGCA TCCCGCGACC CGGCATCGTC GCCGGGGCTA GGCCAGATTG CCCCGCTCCT 420 

CAACGGGCCG CATCTCGTGC CGAATTCCTG CAGCCCGGGG GATCCACTAG TTCTAGAGCG 480 

GCCGCCACCG CGGTGGAGCT 500 
(2) INFORMATION FOR SEQ ID NO: 97: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 96 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 97: 

Val Ala Met Ser Leu Thr Val Gly Ala Gly Val Ala Ser Ala Asp Pro 
1 5 10 15 

Val Asp Ala Val Ile Asn Thr Thr Cys Asn Tyr Gly Gln Val Val Ala 
20 25 30 

Ala Leu Asn Ala Thr Asp Pro Gly Ala Ala Ala Gln Phe Asn Ala Ser 
35 40 45 

Pro Val Ala Gln Ser Tyr Leu Arg Asn Phe Leu Ala Ala Pro Pro Pro 
50 55 60 

Gln Arg Ala Ala Met Ala Ala Gln Leu Gln Ala Val Pro Gly Ala Ala 
65 70 75 80 

Gln Tyr Ile Gly Leu Val Glu Ser Val Ala Gly Ser Cys Asn Asn Tyr 
85 90 95 

(2) INFORMATION FOR SEQ ID NO: 98: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 154 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 98: 
ATGACAGAGC AGCAGTGGAA TTTCGCGGGT ATCGAGGCCG CGGCAAGCGC AATCCAGGGA 60 

AATGTCACGT CCATTCATTC CCTCCTTGAC GAGGGGAAGC AGTCCCTGAC CAAGCTCGCA 120 

GCGGCCTGGG GCGGTAGCGG TTCGGAAGCG TACC 154 

(2) INFORMATION FOR SEQ ID NO: 99: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 51 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 99: 

Met Thr Glu Gin Gin Trp Asn Phe Ala Gly lie Glu Ala Ala Ala Ser 

Ala He Gin Gly Asn Val Thr Ser He His Ser Leu Leu Asp Glu Gly 
20 25 3 0 

Lys Gin Ser Leu Thr Lys Leu Ala Ala Ala Trp Gly Gly Ser Gly 



35 40 

Glu Ala Tyr 
50 

(2) INFORMATION FOR SEQ ID NO: 100: 
(i) SEQUENCE CHARACTERISTICS: 



Ser 

45 
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(A) LENGTH: 282 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 100: 



CGGTCGCGCA CTTCCAGGTG ACTATGAAAG TCGGCTTCCG NCTGGAGGAT TCCTGAACCT 60 

TCAAGCGCGG CCGATAACTG AGGTGCATCA TTAAGCGACT TTTCCAGAAC ATCCTGACGC 120 

GCTCGAAACG CGGCACAGCC GACGGTGGCT CCGNCGAGGC GCTGNCTCCA AAATCCCTGA 180 

GACAATTCGN CGGGGGCGCC TACAAGGAAG TCGGTGCTGA ATTCGNCGNG TATCTGGTjbG 240 

ACCTGTGTGG TCTGNAGCCG GACGAAGCGG TGCTCGACGT CG 282 



(2) INFORMATION FOR SEQ ID NO: 101: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3058 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 101: 

GATCGTACCC GTGCGAGTGC TCGGGCCGTT TGAGGATGGA GTGCACGTGT CTTTCGTGAT 60 

GGCATACCCA GAGATGTTGG CGGCGGCGGC TGACACCCTG CAGAGCATCG GTGCTACCAC 120 

TGTGGCTAGC AATGCCGCTG CGGCGGCCCC GACGACTGGG GTGGTGCCCC CCGCTGCCGA 180 

TGAGGTGTCG GCGCTGACTG CGGCGCACTT CGCCGCACAT GCGGCGATGT ATCAGTCCGT 240 

GAGCGCTCGG GCTGCTGCGA TTCATGACCA GTTCGTGGCC ACCCTTGCCA GCAGCGCCAG 300 

CTCGTATGCG GCCACTGAAG TCGCCAATGC GGCGGCGGCC AGCTAAGCCA GGAACAGTCG 360 

GCACGAGAAA CCACGAGAAA TAGGGACACG TAATGGTGGA TTTCGGGGCG TTACCACCGG 420 

AGATCAACTC CGCGAGGATG TACGCCGGCC CGGGTTCGGC CTCGCTGGTG GCCGCGGCTC 480 

AGATGTGGGA CAGCGTGGCG AGTGACCTGT TTTCGGCCGC GTCGGCGTTT CAGTCGGTGG 540 

TCTGGGGTCT GACGGTGGGG TCGTGGATAG GTTCGTCGGC GGGTCTGATG GTGGCGGCGG 600 

2040 

ACAGGTTCGA TGACCATCAA CTATCAATTC GGGGATGTCG ACGCTCACGG CGCCATGATC 2100 

CGCGCTCAGG CCGGGTTGCT GGAGGCCGAG CATCAGGCCA TCATTCGTGA TGTGTTGACC 2160 

GCGAGTGACT TTTGGGGCGG CGCCGGTTCG GCGGCCTGCC AGGGGTTCAT TACCCAGTTG 2220 

GGCCGTAACT TCCAGGTGAT CTACGAGCAG GCCAACGCCC ACGGGCAGAA GGTGCAGGCT 2280 

GCCGGCAACA ACATGGCGCA AACCGACAGC GCCGTCGGCT CCAGCTGGGC CTGACACCAG 2340 

GCCAAGGCCA GGGACGTGGT GTACGAGTGA AGTTCCTCGC GTGATCCTTC GGGTGGCAGT 2400 

CTAAGTGGTC AGTGCTGGGG TGTTGGTGGT TTGCTGCTTG GCGGGTTCTT CGGTGCTGGT 2460 

CAGTGCTGCT CGGGCTCGGG TGAGGACCTC GAGGCCCAGG TAGCGCCGTC CTTCGATCCA 2520 

TTCGTCGTGT TGTTCGGCGA GGACGGCTCC GACGAGGCGG ATGATCGAGG CGCGGTCGGG 2580 

GAAGATGCCC ACGACGTCGG TTCGGCGTCG TACCTCTCGG TTGAGGCGTT CCTGGGGGTT 2640 

GTTGGACCAG ATTTGGCGCC AGATCTGCTT GGGGAAGGCG GTGAACGCCA GCAGGTCGGT 2700 

GCGGGCGGTG TCGAGGTGCT CGGCCACCGC GGGGAGTTTG TCGGTCAGAG CGTCGAGTAC 2760 

CCGATCATAT TGGGCAACAA CTGATTCGGC GTCGGGCTGG TCGTAGATGG AGTGCAGCAG 2820 

GGTGCGCACC CACGGCCAGG AGGGCTTCGG GGTGGCTGCC ATCAGATTGG CTGCGTAGTG 2880 

GGTTCTGCAG CGCTGCCAGG CCGCTGCGGG CAGGGTGGCG CCGATCGCGG CCACCAGGCC 2940 

GGCGTGGGCG TCGCTGGTGA CCAGCGCGAC CCCGGACAGG CCGCGGGCGA CCAGGTCGCG 3000 

GAAGAACGCC AGCCAGCCGG CCCCGTCCTC GGCGGAGGTG ACCTGGATGC CCAGGATC 3058 
(2) INFORMATION FOR SEQ ID NO: 102: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 391 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 102: 

Met Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met 
1 5 10 15 

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Gln Met Trp 
20 25 30 

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser 
35 40 45 

Val Val Trp Gly Leu Thr Val Gly Ser Trp Ile Gly Ser Ser Ala Gly 
50 55 60 

Leu Met Val Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr 
65 70 75 80 

Ala Gly Gln Ala Glu Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala 
85 90 95 

Ala Tyr Glu Thr Ala Tyr Gly Leu Thr Val Pro Pro Pro Val Ile Ala 
100 105 110 

Glu Asn Arg Ala Glu Leu Met Ile Leu Ile Ala Thr Asn Leu Leu Gly 
115 120 125 

Gln Asn Thr Pro Ala Ile Ala Val Asn Glu Ala Glu Tyr Gly Glu Met 
130 135 140 

Trp Ala Gln Asp Ala Ala Ala Met Phe Gly Tyr Ala Ala Ala Thr Ala 
145 150 155 160 

Thr Ala Thr Ala Thr Leu Leu Pro Phe Glu Glu Ala Pro Glu Met Thr 
165 170 175 

Ser Ala Gly Gly Leu Leu Glu Gln Ala Ala Ala Val Glu Glu Ala Ser 
180 185 190 

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu 
195 200 205 

Gln Gln Leu Ala Gln Pro Thr Gln Gly Thr Thr Pro Ser Ser Lys Leu 
210 215 220 

Gly Gly Leu Trp Lys Thr Val Ser Pro His Arg Ser Pro Ile Ser Asn 
225 230 235 240 

Met Val Ser Met Ala Asn Asn His Met Ser Met Thr Asn Ser Gly Val 
245 250 255 

Ser Met Thr Asn Thr Leu Ser Ser Met Leu Lys Gly Phe Ala Pro Ala 
260 265 270 

Ala Ala Ala Gln Ala Val Gln Thr Ala Ala Gln Asn Gly Val Arg Ala 
275 280 285 

Met Ser Ser Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu Gly Gly Gly 
290 295 300 

Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser Leu Ser Val 
305 310 315 320 

Pro Gln Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro Ala Ala Arg 
325 330 335 

Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Glu Arg Gly Pro Gly 
340 345 350 

Gln Met Leu Gly Gly Leu Pro Val Gly Gln Met Gly Ala Arg Ala Gly 
355 360 365 

Gly Gly Leu Ser Gly Val Leu Arg Val Pro Pro Arg Pro Tyr Val Met 
370 375 380 

Pro His Ser Pro Ala Ala Gly 
385 390 

(2) INFORMATION FOR SEQ ID NO: 103: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1725 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 103: 

GACGTCAGCA CCCGCCGTGC AGGGCTGGAG CGTGGTCGGT TTTGATCTGC GGTCAAGGTG 60 

ACGTCCCTCG GCGTGTCGCC GGCGTGGATG CAGACTCGAT GCCGCTCTTT AGTGCAACTA 120 

ATTTCGTTGA AGTGCCTGCG AGGTATAGGA CTTCACGATT GGTTAATGTA GCGTTCACCC 180 

CGTGTTGGGG TCGATTTGGC CGGACCAGTC GTCACCAACG CTTGGCGTGC GCGCCAGGCG 240 

GGCGATCAGA TCGCTTGACT ACCAATCAAT CTTGAGCTCC CGGGCCGATG CTCGGGCTAA 300 

ATGAGGAGGA GCACGCGTGT CTTTCACTGC GCAACCGGAG ATGTTGGCGG CCGCGGCTGG 360 

CGAACTTCGT TCCCTGGGGG CAACGCTGAA GGCTAGCAAT GCCGCCGCAG CCGTGCCGAC 420 

GACTGGGGTG GTGCCCCCGG CTGCCGACGA GGTGTCGCTG CTGCTTGCCA CACAATTCCG 480 

TACGCATGCG GCGACGTATC AGACGGCCAG CGCCAAGGCC GCGGTGATCC ATGAGCAGTT 540 

TGTGACCACG CTGGCCACCA GCGCTAGTTC ATATGCGGAC ACCGAGGCCG CCAACGCTGT 600 

GGTCACCGGC TAGCTGACCT GACGGTATTC GAGCGGAAGG ATTATCGAAG TGGTGGATTT 660 

CGGGGCGTTA CCACCGGAGA TCAACTCCGC GAGGATGTAC GCCGGCCCGG GTTCGGCCTC 720 

GCTGGTGGCC GCCGCGAAGA TGTGGGACAG CGTGGCGAGT GACCTGTTTT CGGCCGCGTC 780 

GGCGTTTCAG TCGGTGGTCT GGGGTCTGAC GGTGGGGTCG TGGATAGGTT CGTCGGCGGG 840 

TCTGATGGCG GCGGCGGCCT CGCCGTATGT GGCGTGGATG AGCGTCACCG CGGGGCAGGC 900 

CCAGCTGACC GCCGCCCAGG TCCGGGTTGC TGCGGCGGCC TACGAGACAG CGTATAGGCT 960 

GACGGTGCCC CCGCCGGTGA TCGCCGAGAA CCGTACCGAA CTGATGACGC TGACCGCGAC 1020 

CAACCTCTTG GGGCAAAACA CGCCGGCGAT CGAGGCCAAT CAGGCCGCAT ACAGCCAGAT 1080 






GTGGGGCCAA GACGCGGAGG CGATGTATGG CTACGCCGCC ACGGCGGPGA <~<jL>U(jACCGA 1140 

GGCGTTGCTG CCGTTCGAGG ACGCCCCACT GATCACCAAC XCCTTGAGCA 1200 

GGCCGTCGCG GTCGAGGAGG CCATCGACAC CGCCGCGGCG l CjAACAAT GT 1260 

GCCCCAAGCG CTGCAACAGC TGGCCCAGCC AGCGCAGGGC CTTCCAAGCT 1320 

GGGTGGGCTG TGGACGGCGG TCTCGCCGCA TCTGTCGrrn TCAGTTCGAT 1380 

AGCCAACAAC CACATGTCGA TGATGGGCAC ATGACCAACA CCTTGCACTC 1440 

GATGTTGAAG GGCTTAGCTC CGGCGGCGGC TCAGGCCGTG GAAACCGCGG CGGAAAACGG 1500 

GGTCTGGGCG ATGAGCTCGC TGGGCAGCCA GCTGGGTTCG TCGCTGGGTT CTTCGGGTCT 1560 

GGGCGCTGGG GTGGCCGCCA ACTTGGGTCG GGCGGCCTCG GTCGGTTCGT TGTCGGTGCC 1620 

GCCAGCATGG GCCGCGGCCA ACCAGGCGGT CACCCCGGCG GCGCGGGCGC TGCCGCTGAC 1680 

CAGCCTGAGC AGCGCCGCCC AAACCGCCCC CGGACACATG CTGGG 1725 


(2) INFORMATION FOR SEQ ID NO: 104: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 359 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:104: 

Val Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met 
1 5 10 15 

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Lys Met Trp 
20 25 30 

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser 
35 40 45 

Val Val Trp Gly Leu Thr Val Gly Ser Trp Ile Gly Ser Ser Ala Gly 
50 55 60 

Leu Met Ala Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr 
65 70 75 80 

Ala Gly Gln Ala Gln Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala 
85 90 95 

Ala Tyr Glu Thr Ala Tyr Arg Leu Thr Val Pro Pro Pro Val Ile Ala 
100 105 110 

Glu Asn Arg Thr Glu Leu Met Thr Leu Thr Ala Thr Asn Leu Leu Gly 
115 120 125 

Gln Asn Thr Pro Ala Ile Glu Ala Asn Gln Ala Ala Tyr Ser Gln Met 
130 135 140 

Trp Gly Gln Asp Ala Glu Ala Met Tyr Gly Tyr Ala Ala Thr Ala Ala 
145 150 155 160 

Thr Ala Thr Glu Ala Leu Leu Pro Phe Glu Asp Ala Pro Leu Ile Thr 
165 170 175 

Asn Pro Gly Gly Leu Leu Glu Gln Ala Val Ala Val Glu Glu Ala Ile 
180 185 190 

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu 
195 200 205 

Gln Gln Leu Ala Gln Pro Ala Gln Gly Val Val Pro Ser Ser Lys Leu 
210 215 220 

Gly Gly Leu Trp Thr Ala Val Ser Pro His Leu Ser Pro Leu Ser Asn 
225 230 235 240 

Val Ser Ser Ile Ala Asn Asn His Met Ser Met Met Gly Thr Gly Val 
245 250 255 

Ser Met Thr Asn Thr Leu His Ser Met Leu Lys Gly Leu Ala Pro Ala 
260 265 270 

Ala Ala Gln Ala Val Glu Thr Ala Ala Glu Asn Gly Val Trp Ala Met 
275 280 285 

Ser Ser Leu Gly Ser Gln Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu 
290 295 300 

Gly Ala Gly Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser 
305 310 315 320 

Leu Ser Val Pro Pro Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro 
325 330 335 

Ala Ala Arg Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Gln Thr 
340 345 350 

Ala Pro Gly His Met Leu Gly 
355 

(2) INFORMATION FOR SEQ ID NO: 105: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3027 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 105: 
AGTTCAGTCG AGAATGATAC TGACGGGCTG TATCCACGAT GGCTGAGACA ACCGAACCAC 60 

CGTCGGACGC GGGGACATCG CAAGCCGACG CGATGGCGTT GGCCGCCGAA GCCGAAGCCG 120 

CCGAAGCCGA AGCGCTGGCC GCCGCGGCGC GGGCCCGTGC CCGTGCCGCC CGGTTGAAGC 180 

GTGAGGCGCT GGCGATGGCC CCAGCCGAGG ACGAGAACGT CCCCGAGGAT ATGCAGACTG 240 

GGAAGACGCC GAAGACTATG ACGACTATGA CGACTATGAG GCCGCAGACC AGGAGGCCGC 300 

ACGGTCGGCA TCCTGGCGAC GGCGGTTGCG GGTGCGGTTA CCAAGACTGT CCACGATTGC 360 

CATGGCGGCC GCAGTCGTCA TCATCTGCGG CTTCACCGGG CTCAGCGGAT ACATTGTGTG 420 

GCAACACCAT GAGGCCACCG AACGCCAGCA GCGCGCCGCG GCGTTCGCCG CCGGAGCCAA 480 

GCAAGGTGTC ATCAACATGA CCTCGCTGGA CTTCAACAAG GCCAAAGAAG ACGTCGCGCG 540 

TGTGATCGAC AGCTCCACCG GCGAATTCAG GGATGACTTC CAGCAGCGGG CAGCCGATTT 600 

CACCAAGGTT GTCGAACAGT CCAAAGTGGT CACCGAAGGC ACGGTGAACG CGACAGCCGT 660 

CGAATCCATG AACGAGCATT CCGCCGTGGT GCTCGTCGCG GCGACTTCAC GGGTCACCAA 720 

TTCCGCTGGG GCGAAAGACG AACCACGTGC GTGGCGGCTC AAAGTGACCG TGACCGAAGA 780 

GGGGGGACAG TACAAGATGT CGAAAGTTGA GTTCGTACCG TGACCGATGA CGTACGCGAC 840 

GTCAACACCG AAACCACTGA CGCCACCGAA GTCGCTGAGA TCGACTCAGC CGCAGGCGAA 900 

GCCGGTGATT CGGCGACCGA GGCATTTGAC ACCGACTCTG CAACGGAATC TACCGCGCAG 960 

AAGGGTCAGC GGCACCGTGA CCTGTGGCGA ATGCAGGTTA CCTTGAAACC CGTTCCGGTG 1020 

ATTCTCATCC TGCTCATGTT GATCTCTGGG GGCGCGACGG GATGGCTATA CCTTGAGCAA 1080 

TACGACCCGA TCAGCAGACG GACTCCGGCG CCGCCCGTGC TGCCGTCGCC GCGGCGTCTG 1140 

ACGGGACAAT CGCGCTGTTG TGTATTCACC CGACACGTCG ACCAAGACTT CGCTACCGCC 1200 

AGGTCGCACC TCGCCGGCGA TTTCCTGTCC TATACGACCA GTTCACGCAG CAGATCGTGG 1260 

CTCCGGCGGC CAAACAGAAG TCACTGAAAA CCACCGCCAA GGTGGTGCGC GCGGCCGTGT 1320 

CGGAGCTACA TCCGGATTCG GCCGTCGTTC TGGTTTTTGT CGACCAGAGC ACTACCAGTA 1380 

AGGACAGCCC CAATCCGTCG ATGGCGGCCA GCAGCGTGAT GGTGACCCTA GCCAAGGTCG 1440 

ACGGCAATTG GCTGATCACC AAGTTCACCC CGGTTTAGGT TGCCGTAGGC GGTCGCCAAG 1500 

TCTGACGGGG GCGCGGGTGG CTGCTCGTGC GAGATACCGG CCGTTCTCCG GACAATCACG 1560 

GCCCGACCTC AAACAGATCT CGGCCGCTGT CTAATCGGCC GGGTTATTTA AGATTAGTTG 1620 

CCACTGTATT TACCTGATGT TCAGATTGTT CAGCTGGATT TAGCTTCGCG GCAGGGCGGC 1680 

TGGTGCACTT TGCATCTGGG GTTGTGACTA CTTGAGAGAA TTTGACCTGT TGCCGACGTT 1740 

GTTTGCTGTC CATCATTGGT GCTAGTTATG GCCGAGCGGA AGGATTATCG AAGTGGTGGA 1800 

CTTCGGGGCG TTACCACCGG AGATCAACTC CGCGAGGATG TACGCCGGCC CGGGTTCGGC 1860 

CTCGCTGGTG GCCGCCGCGA AGATGTGGGA CAGCGTGGCG AGTGACCTGT TTTCGGCCGC 1920 

GTCGGCGTTT CAGTCGGTGG TCTGGGGTCT GACGACGGGA TCGTGGATAG GTTCGTCGGC 1980 

GGGTCTGATG GTGGCGGCGG CCTCGCCGTA TGTGGCGTGG ATGAGCGTCA CCGCGGGGCA 2040 

GGCCGAGCTG ACCGCCGCCC AGGTCCGGGT TGCTGCGGCG GCCTACGAGA CGGCGTATGG 2100 

GCTGACGGTG CCCCCGCCGG TGATCGCCGA GAACCGTGCT GAACTGATGA TTCTGATAGC 2160 

GACCAACCTC TTGGGGCAAA ACACCCCGGC GATCGCGGTC AACGAGGCCG AATACGGGGA 2220 

GATGTGGGCC CAAGACGCCG CCGCGATGTT TGGCTACGCC GCCACGGCGG CGACGGCGAC 2280 

CGAGGCGTTG CTGCCGTTCG AGGACGCCCC ACTGATCACC AACCCCGGCG GGCTCCTTGA 2340 

GCAGGCCGTC GCGGTCGAGG AGGCCATCGA CACCGCCGCG GCGAACCAGT TGATGAACAA 2400 

TGTGCCCCAA GCGCTGCAAC AACTGGCCCA GCCCACGAAA AGCATCTGGC CGTTCGACCA 2460 

ACTGAGTGAA CTCTGGAAAG CCATCTCGCC GCATCTGTCG CCGCTCAGCA ACATCGTGTC 2520 

GATGCTCAAC AACCACGTGT CGATGACCAA CTCGGGTGTG TCGATGGCCA GCACCTTGCA 2580 

CTCAATGTTG AAGGGCTTTG CTCCGGCGGC GGCTCAGGCC GTGGAAACCG CGGCGCAAAA 2640 

CGGGGTCCAG GCGATGAGCT CGCTGGGCAG CCAGCTGGGT TCGTCGCTGG GTTCTTCGGG 2700 

TCTGGGCGCT GGGGTGGCCG CCAACTTGGG TCGGGCGGCC TCGGTCGGTT CGTTGTCGGT 2760 

GCCGCAGGCC TGGGCCGCGG CCAACCAGGC GGTCACCCCG GCGGCGCGGG CGCTGCCGCT 2820 

GACCAGCCTG ACCAGCGCCG CCCAAACCGC CCCCGGACAC ATGCTGGGCG GGCTACCGCT 2880 

GGGGCAACTG ACCAATAGCG GCGGCGGGTT CGGCGGGGTT AGCAATGCGT TGCGGATGCC 2940 

GCCGCGGGCG TACGTAATGC CCCGTGTGCC CGCCGCCGGG TAACGCCGAT CCGCACGCAA 3000 

TGCGGGCCCT CTATGCGGGC AGCGATC 3027 
(2) INFORMATION FOR SEQ ID NO: 106: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS : 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 106: 

Val Val Asp Phe Gly Ala Leu Pro Pro Glu Ile Asn Ser Ala Arg Met 
1 5 10 15 

Tyr Ala Gly Pro Gly Ser Ala Ser Leu Val Ala Ala Ala Lys Met Trp 
20 25 30 

Asp Ser Val Ala Ser Asp Leu Phe Ser Ala Ala Ser Ala Phe Gln Ser 
35 40 45 

Val Val Trp Gly Leu Thr Thr Gly Ser Trp Ile Gly Ser Ser Ala Gly 
50 55 60 

Leu Met Val Ala Ala Ala Ser Pro Tyr Val Ala Trp Met Ser Val Thr 
65 70 75 80 

Ala Gly Gln Ala Glu Leu Thr Ala Ala Gln Val Arg Val Ala Ala Ala 
85 90 95 

Ala Tyr Glu Thr Ala Tyr Gly Leu Thr Val Pro Pro Pro Val Ile Ala 
100 105 110 

Glu Asn Arg Ala Glu Leu Met Ile Leu Ile Ala Thr Asn Leu Leu Gly 
115 120 125 

Gln Asn Thr Pro Ala Ile Ala Val Asn Glu Ala Glu Tyr Gly Glu Met 
130 135 140 

Trp Ala Gln Asp Ala Ala Ala Met Phe Gly Tyr Ala Ala Thr Ala Ala 
145 150 155 160 

Thr Ala Thr Glu Ala Leu Leu Pro Phe Glu Asp Ala Pro Leu Ile Thr 
165 170 175 

Asn Pro Gly Gly Leu Leu Glu Gln Ala Val Ala Val Glu Glu Ala Ile 
180 185 190 

Asp Thr Ala Ala Ala Asn Gln Leu Met Asn Asn Val Pro Gln Ala Leu 
195 200 205 

Gln Gln Leu Ala Gln Pro Thr Lys Ser Ile Trp Pro Phe Asp Gln Leu 
210 215 220 

Ser Glu Leu Trp Lys Ala Ile Ser Pro His Leu Ser Pro Leu Ser Asn 
225 230 235 240 

Ile Val Ser Met Leu Asn Asn His Val Ser Met Thr Asn Ser Gly Val 
245 250 255 

Ser Met Ala Ser Thr Leu His Ser Met Leu Lys Gly Phe Ala Pro Ala 
260 265 270 

Ala Ala Gln Ala Val Glu Thr Ala Ala Gln Asn Gly Val Gln Ala Met 
275 280 285 

Ser Ser Leu Gly Ser Gln Leu Gly Ser Ser Leu Gly Ser Ser Gly Leu 
290 295 300 

Gly Ala Gly Val Ala Ala Asn Leu Gly Arg Ala Ala Ser Val Gly Ser 
305 310 315 320 

Leu Ser Val Pro Gln Ala Trp Ala Ala Ala Asn Gln Ala Val Thr Pro 
325 330 335 

Ala Ala Arg Ala Leu Pro Leu Thr Ser Leu Thr Ser Ala Ala Gln Thr 
340 345 350 

Ala Pro Gly His Met Leu Gly Gly Leu Pro Leu Gly Gln Leu Thr Asn 
355 360 365 

Ser Gly Gly Gly Phe Gly Gly Val Ser Asn Ala Leu Arg Met Pro Pro 
370 375 380 

Arg Ala Tyr Val Met Pro Arg Val Pro Ala Ala Gly 
385 390 395 

(2) INFORMATION FOR SEQ ID NO: 107: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1616 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 107: 
CATCGGAGGG AGTGATCACC ATGCTGTGGC ACGCAATGCC ACCGGAGTAA ATACCGCACG 60 

GCTGATGGCC GGCGCGGGTC CGGCTCCAAT GCTTGCGGCG GCCGCGGGAT GGCAGACGCT 120 

TTCGGCGGCT CTGGACGCTC AGGCCGTCGA GTTGACCGCG CGCCTGAACT CTCTGGGAGA 180 

AGCCTGGACT GGAGGTGGCA GCGACAAGGC GCTTGCGGCT GCAACGCCGA TGGTGGTCTG 240 

GCTACAAACC GCGTCAACAC AGGCCAAGAC CCGTGCGATG CAGGCGACGG CGCAAGCCGC 300 

GGCATACACC CAGGCCATGG CCACGACGCC GTCGCTGCCG GAGATCGCCG CCAACCACAT 360 

CACCCAGGCC GTCCTTACGG CCACCAACTT CTTCGGTATC AACACGATCC CGATCGCGTT 420 

GACCGAGATG GATTATTTCA TCCGTATGTG GAACCAGGCA GCCCTGGCAA TGGAGGTCTA 480 

CCAGGCCGAG ACCGCGGTTA ACACGCTTTT CGAGAAGCTC GAGCCGATGG CGTCGATCCT 540 

TGATCCCGGC GCGAGCCAGA GCACGACGAA CCCGATCTTC GGAATGCCCT CCCCTGGCAG 600 

CTCAACACCG GTTGGCCAGT TGCCGCCGGC GGCTACCCAG ACCCTCGGCC AACTGGGTGA 660 

GATGAGCGGC CCGATGCAGC AGCTGACCCA GCCGCTGCAG CAGGTGACGT CGTTGTTCAG 720 

CCAGGTGGGC GGCACCGGCG GCGGCAACCC AGCCGACGAG GAAGCCGCGC AGATGGGCCT 780 

GCTCGGCACC AGTCCGCTGT CGAACCATCC GCTGGCTGGT GGATCAGGCC CCAGCGCGGG 840 

CGCGGGCCTG CTGCGCGCGG AGTCGCTACC TGGCGCAGGT GGGTCGTTGA CCCGCACGCC 900 

GCTGATGTCT CAGCTGATCG AAAAGCCGGT TGCCCCCTCG GTGATGCCGG CGGCTGCTGC 960 

CGGATCGTCG GCGACGGGTG GCGCCGCTCC GGTGGGTGCG GGAGCGATGG GCCAGGGTGC 1020 

GCAATCCGGC GGCTCCACCA GGCCGGGTCT GGTCGCGCCG GCACCGCTCG CGCAGGAGCG 1080 

TGAAGAAGAC GACGAGGACG ACTGGGACGA AGAGGACGAC TGGTGAGCTC CCGTAATGAC 1140 

AACAGACTTC CCGGCCACCC GGGCCGGAAG ACTTGCCAAC ATTTTGGCGA GGAAGGTAAA 1200 

GAGAGAAAGT AGTCCAGCAT GGCAGAGATG AAGACCGATG CCGCTACCCT CGCGCAGGAG 1260 

GCAGGTAATT TCGAGCGGAT CTCCGGCGAC CTGAAAACCC AGATCGACCA GGTGGAGTCG 1320 

ACGGCAGGTT CGTTGCAGGG CCAGTGGCGC GGCGCGGCGG GGACGGCCGC CCAGGCCGCG 1380 

GTGGTGCGCT TCCAAGAAGC AGCCAATAAG CAGAAGCAGG AACTCGACGA GATCTCGACG 1440 

AATATTCGTC AGGCCGGCGT CCAATACTCG AGGGCCGACG AGGAGCAGCA GCAGGCGCTG 1500 

TCCTCGCAAA TGGGCTTCTG ACCCGCTAAT ACGAAAAGAA ACGGAGCAAA AACATGACAG 1560 

AGCAGCAGTG GAATTTCGCG GGTATCGAGG CCGCGGCAAG CGCAATCCAG GGAAAT 1616 
(2) INFORMATION FOR SEQ ID NO: 108: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 432 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 108: 

CTAGTGGATG GGACCATGGC CATTTTCTGC AGTCTCACTG CCTTCTGTGT TGACATTTTG 60 

GCACGCCGGC GGAAACGAAG CACTGGGGTC GAAGAACGGC TGCGCTGCCA TATCGTCCGG 120 

AGCTTCCATA CCTTCGTGCG GCCGGAAGAG CTTGTCGTAG TCGGCCGCCA TGACAACCTC 180 

TCAGAGTGCG CTCAAACGTA TAAACACGAG AAAGGGCGAG ACCGACGGAA GGTCGAACTC 240 

GCCCGATCCC GTGTTTCGCT ATTCTACGCG AACTCGGCGT TGCCCTATGC GAACATCCCA 300 

GTGACGTTGC CTTCGGTCGA AGCCATTGCC TGACCGGCTT CGCTGATCGT CCGCGCCAGG 360 

TTCTGCAGCG CGTTGTTCAG CTCGGTAGCC GTGGCGTCCC ATTTTTGCTG GACACCCTGG 420 

TACGCCTCCG AA 432 
(2) INFORMATION FOR SEQ ID NO: 109: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 368 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 109: 

Met Leu Trp His Ala Met Pro Pro Glu Xaa Asn Thr Ala Arg Leu Met 
1 5 10 15 

Ala Gly Ala Gly Pro Ala Pro Met Leu Ala Ala Ala Ala Gly Trp Gln 
20 25 30 

Thr Leu Ser Ala Ala Leu Asp Ala Gln Ala Val Glu Leu Thr Ala Arg 
35 40 45 

Leu Asn Ser Leu Gly Glu Ala Trp Thr Gly Gly Gly Ser Asp Lys Ala 
50 55 60 

Leu Ala Ala Ala Thr Pro Met Val Val Trp Leu Gln Thr Ala Ser Thr 
65 70 75 80 

Gln Ala Lys Thr Arg Ala Met Gln Ala Thr Ala Gln Ala Ala Ala Tyr 
85 90 95 

Thr Gln Ala Met Ala Thr Thr Pro Ser Leu Pro Glu Ile Ala Ala Asn 
100 105 110 

His Ile Thr Gln Ala Val Leu Thr Ala Thr Asn Phe Phe Gly Ile Asn 
115 120 125 

Thr Ile Pro Ile Ala Leu Thr Glu Met Asp Tyr Phe Ile Arg Met Trp 
130 135 140 

Asn Gln Ala Ala Leu Ala Met Glu Val Tyr Gln Ala Glu Thr Ala Val 
145 150 155 160 

Asn Thr Leu Phe Glu Lys Leu Glu Pro Met Ala Ser Ile Leu Asp Pro 
165 170 175 

Gly Ala Ser Gln Ser Thr Thr Asn Pro Ile Phe Gly Met Pro Ser Pro 
180 185 190 

Gly Ser Ser Thr Pro Val Gly Gln Leu Pro Pro Ala Ala Thr Gln Thr 
195 200 205 

Leu Gly Gln Leu Gly Glu Met Ser Gly Pro Met Gln Gln Leu Thr Gln 
210 215 220 

Pro Leu Gln Gln Val Thr Ser Leu Phe Ser Gln Val Gly Gly Thr Gly 
225 230 235 240 

Gly Gly Asn Pro Ala Asp Glu Glu Ala Ala Gln Met Gly Leu Leu Gly 
245 250 255 

Thr Ser Pro Leu Ser Asn His Pro Leu Ala Gly Gly Ser Gly Pro Ser 
260 265 270 

Ala Gly Ala Gly Leu Leu Arg Ala Glu Ser Leu Pro Gly Ala Gly Gly 
275 280 285 

Ser Leu Thr Arg Thr Pro Leu Met Ser Gln Leu Ile Glu Lys Pro Val 
290 295 300 

Ala Pro Ser Val Met Pro Ala Ala Ala Ala Gly Ser Ser Ala Thr Gly 
305 310 315 320 

Gly Ala Ala Pro Val Gly Ala Gly Ala Met Gly Gln Gly Ala Gln Ser 
325 330 335 

Gly Gly Ser Thr Arg Pro Gly Leu Val Ala Pro Ala Pro Leu Ala Gln 
340 345 350 

Glu Arg Glu Glu Asp Asp Glu Asp Asp Trp Asp Glu Glu Asp Asp Trp 
355 360 365 



(2) INFORMATION FOR SEQ ID NO: 110: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 100 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 110: 

Met Ala Glu Met Lys Thr Asp Ala Ala Thr Leu Ala Gln Glu Ala Gly 
1 5 10 15 

Asn Phe Glu Arg Ile Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val 
20 25 30 

Glu Ser Thr Ala Gly Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly 
35 40 45 

Thr Ala Ala Gln Ala Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys 
50 55 60 

Gln Lys Gln Glu Leu Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly 
65 70 75 80 

Val Gln Tyr Ser Arg Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser 
85 90 95 

Gln Met Gly Phe 
100 



(2) INFORMATION FOR SEQ ID NO: 111: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 396 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 111: 
GATCTCCGGC GACCTGAAAA CCCAGATCGA CCAGGTGGAG TCGACGGCAG GTTCGTTGCA 60 

GGGCCAGTGG CGCGGCGCGG CGGGGACGGC CGCCCAGGCC GCGGTGGTGC GCTTCCAAGA 120 

AGCAGCCAAT AAGCAGAAGC AGGAACTCGA CGAGATCTCG ACGAATATTC GTCAGGCCGG 180 

CGTCCAATAC TCGAGGGCCG ACGAGGAGCA GCAGCAGGCG CTGTCCTCGC AAATGGGCTT 240 

CTGACCCGCT AATACGAAAA GAAACGGAGC AAAAACATGA CAGAGCAGCA GTGGAATTTC 300 

GCGGGTATCG AGGCCGCGGC AAGCGCAATC CAGGGAAATG TCACGTCCAT TCATTCCCTC 360 

CTTGACGAGG GGAAGCAGTC CCTGACCAAG CTCGCA 396 
(2) INFORMATION FOR SEQ ID NO: 112: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 80 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 112: 

Ile Ser Gly Asp Leu Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala 
1 5 10 15 

Gly Ser Leu Gln Gly Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln 
20 25 30 

Ala Ala Val Val Arg Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu 
35 40 45 

Leu Asp Glu Ile Ser Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser 
50 55 60 

Arg Ala Asp Glu Glu Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe 
65 70 75 80 



(2) INFORMATION FOR SEQ ID NO: 113: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 387 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 113: 
GTGGATCCCG ATCCCGTGTT TCGCTATTCT ACGCGAACTC GGCGTTGCCC TATGCGAACA 60 

TCCCAGTGAC GTTGCCTTCG GTCGAAGCCA TTGCCTGACC GGCTTCGCTG ATCGTCCGCG 120 

CCAGGTTCTG CAGCGCGTTG TTCAGCTCGG TAGCCGTGGC GTCCCATTTT TGCTGGACAC 180 

CCTGGTACGC CTCCGAACCG CTACCGCCCC AGGCCGCTGC GAGCTTGGTC AGGGACTGCT 240 

TCCCCTCGTC AAGGAGGGAA TGAATGGACG TGACATTTCC CTGGATTGCG CTTGCCGCGG 300 

CCTCGATACC CGCGAAATTC CACTGCTGCT CTGTCATGTT TTTGCTCCGT TTCTTTTCGT 360 

ATTAGCGGGT CAGAAGCCCA TTTGCGA 387 
(2) INFORMATION FOR SEQ ID NO: 114: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 272 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 114: 



CGGCACGAGG ATCTCGGTTG GCCCAACGGC GCTGGCGAGG GCTCCGTTCC GGGGGCGAGC 60 

TGCGCGCCGG ATGCTTCCTC TGCCCGCAGC CGCGCCTGGA TGGATGGACC AGTTGCTACC 120 

TTCCCGACGT TTCGTTCGGT GTCTGTGCGA TAGCGGTGAC CCCGGCGCGC ACGTCGGGAG 180 

TGTTGGGGGG CAGGCCGGGT CGGTGGTTCG GCCGGGGACG CAGACGGTCT GGACGGAACG 240 

GGCGGGGGTT CGCCGATTGG CATCTTTGCC CA 272 



(2) INFORMATION FOR SEQ ID NO: 115: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 20 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 115: 

Asp Pro Val Asp Ala Val Ile Asn Thr Thr Cys Asn Tyr Gly Gln Val 
1 5 10 15 

Val Ala Ala Leu 
20 

(2) INFORMATION FOR SEQ ID NO: 116: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 116: 

Ala Val Glu Ser Gly Met Leu Ala Leu Gly Thr Pro Ala Pro Ser 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 117: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 117: 

Ala Ala Met Lys Pro Arg Thr Gly Asp Gly Pro Leu Glu Ala Ala Lys 
1 5 10 15 

Glu Gly Arg 



(2) INFORMATION FOR SEQ ID NO: 118: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 118: 

Tyr Tyr Trp Cys Pro Gly Gln Pro Phe Asp Pro Ala Trp Gly Pro 
1 5 10 15 



(2) INFORMATION FOR SEQ ID NO: 119: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 14 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 119: 

Asp Ile Gly Ser Glu Ser Thr Glu Asp Gln Gln Xaa Ala Val 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 120: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 120: 

Ala Glu Glu Ser Ile Ser Thr Xaa Glu Xaa Ile Val Pro 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 121: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 17 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 121: 

Asp Pro Glu Pro Ala Pro Pro Val Pro Thr Thr Ala Ala Ser Pro Pro 
1 5 10 15 

Ser 



(2) INFORMATION FOR SEQ ID NO: 122: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 122: 

Ala Pro Lys Thr Tyr Xaa Glu Glu Leu Lys Gly Thr Asp Thr Gly 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 123: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 123: 



Asp Pro Ala Ser Ala Pro Asp Val Pro Thr Ala Ala Gln Leu Thr Ser 
1 5 10 15 

Leu Leu Asn Ser Leu Ala Asp Pro Asn Val Ser Phe Ala Asn 
20 25 30 

(2) INFORMATION FOR SEQ ID NO: 124: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 124: 

Asp Pro Pro Asp Pro His Gln Xaa Asp Met Thr Lys Gly Tyr Tyr Pro 
1 5 10 15 

Gly Gly Arg Arg Xaa Phe 
20 

(2) INFORMATION FOR SEQ ID NO: 125: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 125: 

Asp Pro Gly Tyr Thr Pro Gly 
1 5 

(2) INFORMATION FOR SEQ ID NO: 126: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(D) OTHER INFORMATION: /note= "The Second Residue Can Be Either a Pro or Thr" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 126: 

Xaa Xaa Gly Phe Thr Gly Pro Gln Tyr 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 127: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(ix) FEATURE: 

(D) OTHER INFORMATION: /note= "The Third Residue Can Be Either a Gln or Leu" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 127: 

Xaa Pro Xaa Val Thr Ala Tyr Ala Gly 
1 5 







(2) INFORMATION FOR SEQ ID NO: 128: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 128: 

Xaa Xaa Xaa Glu Lys Pro Phe Leu Arg 
1 5 

(2) INFORMATION FOR SEQ ID NO: 129: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 129: 

Xaa Asp Ser Glu Lys Ser Ala Thr Ile Lys Val Thr Asp Ala Ser 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 130: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 130: 

Ala Gly Asp Thr Xaa Ile Tyr Ile Val Gly Asn Leu Thr Ala Asp 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 131: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 131: 

Ala Pro Glu Ser Gly Ala Gly Leu Gly Gly Thr Val Gln Ala Gly 
1 5 10 15 

(2) INFORMATION FOR SEQ ID NO: 132: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 132: 

Xaa Tyr Ile Ala Tyr Xaa Thr Thr Ala Gly Ile Val Pro Gly Lys Ile 
1 5 10 15 

Asn Val His Leu Val 
20 

(2) INFORMATION FOR SEQ ID NO: 133: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 882 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 133: 
GCAACGCTGT CGTGGCCTTT GCGGTGATCG GTTTCGCCTC GCTGGCGGTG GCGGTGGCGG 
TCACCATCCG ACCGACCGCG GCCTCAAAAC CGGTAGAGGG ACACCAAAAC GCCCAGCCAG 
GGAAGTTCAT GCCGTTGTTG CCGACGCAAC AGCAGGCGCC GGTCCCGCCG CCTCCGCCCG 
ATGATCCCAC CGCTGGATTC CAGGGCGGCA CCATTCCGGC TGTACAGAAC GTGGTGCCGC 
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GGCCGGGTAC CTCACCCGGG GTGGGTGGGA CGCCGGCTTC GCCTGCGCCG GAAGCGCCGG 300 

CCGTGCCCGG TGTTGTGCCT GCCCCGGTGC CAATCCCGGT CCCGATCATC ATTCCCCCGT 360 

TCCCGGGTTG GCAGCCTGGA ATGCCGACCA TCCCCACCGC ACCGCCGACG ACGCCGGTGA 4 20. 

CCACGTCGGC GACGACGCCG CCGACCACGC CGCCGACCAC GCCGGTGACC ACGCCGCCAA 4 80 

CGACGCCGCC GACCACGCCG GTGACCACGC CGCCAACGAC GCCGCCGACC ACGCCGGTGA 54 0 

CCACGCCACC AACGACCGTC GCCCCGACGA CCGTCGCCCC GACGACGGTC GCTCCGACCA 600 

CCGTCGCCCC GACCACGGTC GCTCCAGCCA CCGCCACGCC GACGACCGTC GCTCCGCAGC 660 

CGACGCAGCA GCCCACGCAA CAACCAACCC AACAGATGCC AACCCAGCAG CAGACCGTGG 720 

CCCCGCAGAC GGTGGCGCCG GCTCCGCAGC CGCCGTCCGG TGGCCGCAAC GGCAGCGGCG 780

GGGGCGACTT ATTCGGCGGG TTCTGATCAC GGTCGCGGCT TCACTACGGT CGGAGGACAT 840

GGCCGGTGAT GCGGTGACGG TGGTGCTGCC CTGTCTCAAC GA 882 

(2) INFORMATION FOR SEQ ID NO: 134:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 815 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 134: 

CCATCAACCA ACCGCTCGCG CCGCCCGCGC CGCCGGATCC GCCGTCGCCG CCACGCCCGC 60 

CGGTGCCTCC GGTGCCCCCG TTGCCGCCGT CGCCGCCGTC GCCGCCGACC GGCTGGGTGC 120 

CTAGGGCGCT GTTACCGCCC TGGTTGGCGG GGACGCCGCC GGCACCACCG GTACCGCCGA 180 

TGGCGCCGTT GCCGCCGGCG GCACCGTTGC CACCGTTGCC ACCGTTGCCA CCGTTGCCGA 240

CCAGCCACCC GCCGCGACCA CCGGCACCGC CGGCGCCGCC CGCACCGCCG GCGTGCCCGT 300 

TCGTGCCCGT ACCGCCGGCA CCGCCGTTGC CGCCGTCACC GCCGACGGAA CTACCGGCGG 360 

ACGCGGCCTG CCCGCCGGCG CCGCCCGCAC CGCCATTGGC ACCGCCGTCA CCGCCGGCTG 420

GGAGTGCCGC GATTAGGGCA CTGACCGGCG CAACCAGCGC AAGTACTCTC GGTCACCGAG 480

CACTTCCAGA CGACACCACA GCACGGGGTT GTCGGCGGAC TGGGTGAAAT GGCAGCCGAT 540

AGCGGCTAGC TGTCGGCTGC GGTCAACCTC GATCATGATG TCGAGGTGAC CGTGACCGCG 600

CCCCCCGAAG GAGGCGCTGA ACTCGGCGTT GAGCCGATCG GCGATCGGTT GGGGCAGTGC 660 

CCAGGCCAAT ACGGGGATAC CGGGTGTCNA AGCCGCCGCG AGCGCAGCTT CGGTTGCGCG 720 

ACNGTGGTCG GGGTGGCCTG TTACGCCGTT GTCNTCGAAC ACGAGTAGCA GGTCTGCTCC 780 

GGCGAGGGCA TCCACCACGC GTTGCGTCAG CTCGT 815 
(2) INFORMATION FOR SEQ ID NO: 135: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1152 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic) 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 135: 



ACCAGCCGCC GGCTGAGGTC TCAGATCAGA GAGTCTCCGG ACTCACCGGG GCGGTTCAGC 60

CTTCTCCCAG AACAACTGCT GAAGATCCTC GCCCGCGAAA CAGGCGCTGA TTTGACGCTC 120

TATGACCGGT TGAACGACGA GATCATCCGG CAGATTGATA TGGCACCGCT GGGCTAACAG 180

GTGCGCAAGA TGGTGCAGCT GTATGTCTCG GACTCCGTGT CGCGGATCAG CTTTGCCGAC 240

GGCCGGGTGA TCGTGTGGAG CGAGGAGCTC GGCGAGAGCC AGTATCCGAT CGAGACGCTG 300

GACGGCATCA CGCTGTTTGG GCGGCCGACG ATGACAACGC CCTTCATCGT TGAGATGCTC 360

AAGCGTGAGC GCGACATCCA GCTCTTCACG ACCGACGGCC ACTACCAGGG CCGGATCTCA 420

ACACCCGACG TGTCATACGC GCCGCGGCTC CGTCAGCAAG TTCACCGCAC CGACGATCCT 480

GCGTTCTGCC TGTCGTTAAG CAAGCGGATC GTGTCGAGGA AGATCCTGAA TCAGCAGGCC 540

TTGATTCGGG CACACACGTC GGGGCAAGAC GTTGCTGAGA GCATCCGCAC GATGAAGCAC 600

TCGCTGGCCT GGGTCGATCG ATCGGGCTCC CTGGCGGAGT TGAACGGGTT CGAGGGAAAT 660

GCCGCAAAGG CATACTTCAC CGCGCTGGGG CATCTCGTCC CGCAGGAGTT CGCATTCCAG 720

GGCCGCTCGA CTCGGCCGCC GTTGGACGCC TTCAACTCGA TGGTCAGCCT CGGCTATTCG 780

CTGCTGTACA AGAACATCAT AGGGGCGATC GAGCGTCACA GCCTGAACGC GTATATCGGT 840

TTCCTACACC AGGATTCACG AGGGCACGCA ACGTCTCGTG CCGAATTCGG CACGAGCTCC 900

GCTGAAACCG CTGGCCGGCT GCTCAGTGCC CGTACGTAAT CCGCTGCGCC CAGGCCGGCC 960

CGCCGGCCGA ATACCAGCAG ATCGGACAGC GAATTGCCGC CCAGCCGGTT GGAGCCGTGC 1020 

ATACCGCCGG CACACTCACC GGCAGCGAAC AGGCCTGGCA CCGTGGCGGC GCCGGTGTCC 1080 

GCGTCTACTT CGACACCGCC CATCACGTAG TGACACGTCG GCCCGACTTC CATTGCCTGC 1140 

GTTCGGCACG AG 1152

(2) INFORMATION FOR SEQ ID NO: 136:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 655 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 136: 

CTCGTGCCGA TTCGGCAGGG TGTACTTGCC GGTGGTGTAN GCCGCATGAG TGCCGACGAC 60 

CAGCAATGCG GCAACAGCAC GGATCCCGGT CAACGACGCC ACCCGGTCCA CGTGGGCGAT 120 

CCGCTCGAGT CCGCCCTGGG CGGCTCTTTC CTTGGGCAGG GTCATCCGAC GTGTTTCCGC 180 

CGTGGTTTGC CGCCATTATG CCGGCGCGCC GCGTCGGGCG GCCGGTATGG CCGAANGTCG 240

ATCAGCACAC CCGAGATACG GGTCTGTGCA AGCTTTTTGA GCGTCGCGCG GGGCAGCTTC 300 

GCCGGCAATT CTACTAGCGA GAAGTCTGGC CCGATACGGA TCTGACCGAA GTCGCTGCGG 360 

TGCAGCCCAC CCTCATTGGC GATGGCGCCG ACGATGGCGC CTGGACCGAT CTTGTGCCGC 420

TTGCCGACGG CGACGCGGTA GGTGGTCAAG TCCGGTCTAC GCTTGGGCCT TTGCGGACGG 480

TCCCGACGCT GGTCGCGGTT GCGCCGCGAA AGCGGCGGGT CGGGTGCCAT CAGGAATGCC 540

TCACCGCCGC GGCACTGCAC GGCCAGTGCC GCGGCGATGT CAGCCATCGG GACATCATGC 600 

TCGCGTTCAT ACTCCTCGAC CAGTCGGCGG AACAGCTCGA TTCCCGGACC GCCCA 655 
(2) INFORMATION FOR SEQ ID NO: 137: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 267 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 137: 

Asn Ala Val Val Ala Phe Ala Val Ile Gly Phe Ala Ser Leu Ala Val
1 5 10 15

Ala Val Ala Val Thr Ile Arg Pro Thr Ala Ala Ser Lys Pro Val Glu
20 25 30

Gly His Gln Asn Ala Gln Pro Gly Lys Phe Met Pro Leu Leu Pro Thr
35 40 45

Gln Gln Gln Ala Pro Val Pro Pro Pro Pro Pro Asp Asp Pro Thr Ala
50 55 60

Gly Phe Gln Gly Gly Thr Ile Pro Ala Val Gln Asn Val Val Pro Arg
65 70 75 80

Pro Gly Thr Ser Pro Gly Val Gly Gly Thr Pro Ala Ser Pro Ala Pro
85 90 95

Glu Ala Pro Ala Val Pro Gly Val Val Pro Ala Pro Val Pro Ile Pro
100 105 110

Val Pro Ile Ile Ile Pro Pro Phe Pro Gly Trp Gln Pro Gly Met Pro
115 120 125

Thr Ile Pro Thr Ala Pro Pro Thr Thr Pro Val Thr Thr Ser Ala Thr
130 135 140

Thr Pro Pro Thr Thr Pro Pro Thr Thr Pro Val Thr Thr Pro Pro Thr
145 150 155 160

Thr Pro Pro Thr Thr Pro Val Thr Thr Pro Pro Thr Thr Pro Pro Thr
165 170 175

Thr Pro Val Thr Thr Pro Pro Thr Thr Val Ala Pro Thr Thr Val Ala
180 185 190

Pro Thr Thr Val Ala Pro Thr Thr Val Ala Pro Thr Thr Val Ala Pro
195 200 205

Ala Thr Ala Thr Pro Thr Thr Val Ala Pro Gln Pro Thr Gln Gln Pro
210 215 220

Thr Gln Gln Pro Thr Gln Gln Met Pro Thr Gln Gln Gln Thr Val Ala
225 230 235 240

Pro Gln Thr Val Ala Pro Ala Pro Gln Pro Pro Ser Gly Gly Arg Asn
245 250 255

Gly Ser Gly Gly Gly Asp Leu Phe Gly Gly Phe
260 265

(2) INFORMATION FOR SEQ ID NO: 138: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 174 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 138:

Ile Asn Gln Pro Leu Ala Pro Pro Ala Pro Pro Asp Pro Pro Ser Pro
1 5 10 15

Pro Arg Pro Pro Val Pro Pro Val Pro Pro Leu Pro Pro Ser Pro Pro
20 25 30

Ser Pro Pro Thr Gly Trp Val Pro Arg Ala Leu Leu Pro Pro Trp Leu
35 40 45

Ala Gly Thr Pro Pro Ala Pro Pro Val Pro Pro Met Ala Pro Leu Pro
50 55 60

Pro Ala Ala Pro Leu Pro Pro Leu Pro Pro Leu Pro Pro Leu Pro Thr
65 70 75 80

Ser His Pro Pro Arg Pro Pro Ala Pro Pro Ala Pro Pro Ala Pro Pro
85 90 95

Ala Cys Pro Phe Val Pro Val Pro Pro Ala Pro Pro Leu Pro Pro Ser
100 105 110

Pro Pro Thr Glu Leu Pro Ala Asp Ala Ala Cys Pro Pro Ala Pro Pro
115 120 125

Ala Pro Pro Leu Ala Pro Pro Ser Pro Pro Ala Gly Ser Ala Ala Ile
130 135 140

Arg Ala Leu Thr Gly Ala Thr Ser Ala Ser Thr Leu Gly His Arg Ala
145 150 155 160

Leu Pro Asp Asp Thr Thr Ala Arg Gly Cys Arg Arg Thr Gly
165 170

(2) INFORMATION FOR SEQ ID NO: 139: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 35 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 139: 

Gln Pro Pro Ala Glu Val Ser Asp Gln Arg Val Ser Gly Leu Thr Gly
1 5 10 15

Ala Val Gln Pro Ser Pro Arg Thr Thr Ala Glu Asp Pro Arg Pro Arg
20 25 30

Asn Arg Arg
35

(2) INFORMATION FOR SEQ ID NO: 140:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 104 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 140:

Arg Ala Asp Ser Ala Gly Cys Thr Cys Arg Trp Cys Xaa Pro His Glu
1 5 10 15

Cys Arg Arg Pro Ala Met Arg Gln Gln His Gly Ser Arg Ser Thr Thr
20 25 30

Pro Pro Gly Pro Arg Gly Arg Ser Ala Arg Val Arg Pro Gly Arg Leu
35 40 45

Phe Pro Trp Ala Gly Ser Ser Asp Val Phe Pro Pro Trp Phe Ala Ala
50 55 60

Ile Met Pro Ala Arg Arg Val Gly Arg Pro Val Trp Pro Xaa Val Asp
65 70 75 80

Gln His Thr Arg Asp Thr Gly Leu Cys Lys Leu Phe Glu Arg Arg Ala
85 90 95

Gly Gln Leu Arg Arg Gln Phe Tyr
100

(2) INFORMATION FOR SEQ ID NO: 141:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer" 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 141:
GGATCCATAT GGGCCATCAT CATCATCATC ACGTGATCGA CATCATCGGG ACC 53 
(2) INFORMATION FOR SEQ ID NO: 142: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR Primer" 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 142: 
CCTGAATTCA GGCCTCGGTT GCGCCGGCCT CATCTTGAAC GA 42
(2) INFORMATION FOR SEQ ID NO: 143: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR Primer" 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 143:
GGATCCTGCA GGCTCGAAAC CACCGAGCGG T 31
(2) INFORMATION FOR SEQ ID NO: 144:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer" 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 144:
CTCTGAATTC AGCGCTGGAA ATCGTCGCGA T 31
(2) INFORMATION FOR SEQ ID NO: 145: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 

(A) DESCRIPTION: /desc = "PCR primer" 



(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 145: 
GGATCCAGCG CTGAGATGAA GACCGATGCC GCT 33
(2) INFORMATION FOR SEQ ID NO:146: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid

(A) DESCRIPTION: /desc = "PCR primer"

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 146:
GAGAGAATTC TCAGAAGCCC ATTTGCGAGG ACA 33 
(2) INFORMATION FOR SEQ ID NO: 147: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1993 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mycobacterium tuberculosis 

(ix) FEATURE: 

(A) NAME/KEY: CDS 

(B) LOCATION: 152..1273



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 147: 

TGTTCTTCGA CGGCAGGCTG GTGGAGGAAG GGCCCACCGA ACAGCTGTTC TCCTCGCCGA 60 

AGCATGCGGA AACCGCCCGA TACGTCGCCG GACTGTCGGG GGACGTCAAG GACGCCAAGC 120 

GCGGAAATTG AAGAGCACAG AAAGGTATGG C GTG AAA ATT CGT TTG CAT ACG 172
Val Lys Ile Arg Leu His Thr
1 5

CTG TTG GCC GTG TTG ACC GCT GCG CCG CTG CTG CTA GCA GCG GCG GGC 220
Leu Leu Ala Val Leu Thr Ala Ala Pro Leu Leu Leu Ala Ala Ala Gly
10 15 20

TGT GGC TCG AAA CCA CCG AGC GGT TCG CCT GAA ACG GGC GCC GGC GCC 268
Cys Gly Ser Lys Pro Pro Ser Gly Ser Pro Glu Thr Gly Ala Gly Ala
25 30 35

GGT ACT GTC GCG ACT ACC CCC GCG TCG TCG CCG GTG ACG TTG GCG GAG 316
Gly Thr Val Ala Thr Thr Pro Ala Ser Ser Pro Val Thr Leu Ala Glu
40 45 50 55

ACC GGT AGC ACG CTG CTC TAC CCG CTG TTC AAC CTG TGG GGT CCG GCC 364
Thr Gly Ser Thr Leu Leu Tyr Pro Leu Phe Asn Leu Trp Gly Pro Ala
60 65 70

TTT CAC GAG AGG TAT CCG AAC GTC ACG ATC ACC GCT CAG GGC ACC GGT 412
Phe His Glu Arg Tyr Pro Asn Val Thr Ile Thr Ala Gln Gly Thr Gly
75 80 85

TCT GGT GCC GGG ATC GCG CAG GCC GCC GCC GGG ACG GTC AAC ATT GGG 460
Ser Gly Ala Gly Ile Ala Gln Ala Ala Ala Gly Thr Val Asn Ile Gly
90 95 100

GCC TCC GAC GCC TAT CTG TCG GAA GGT GAT ATG GCC GCG CAC AAG GGG 508
Ala Ser Asp Ala Tyr Leu Ser Glu Gly Asp Met Ala Ala His Lys Gly
105 110 115

CTG ATG AAC ATC GCG CTA GCC ATC TCC GCT CAG CAG GTC AAC TAC AAC 556
Leu Met Asn Ile Ala Leu Ala Ile Ser Ala Gln Gln Val Asn Tyr Asn
120 125 130 135

CTG CCC GGA GTG AGC GAG CAC CTC AAG CTG AAC GGA AAA GTC CTG GCG 604
Leu Pro Gly Val Ser Glu His Leu Lys Leu Asn Gly Lys Val Leu Ala
140 145 150

GCC ATG TAC CAG GGC ACC ATC AAA ACC TGG GAC GAC CCG CAG ATC GCT 652
Ala Met Tyr Gln Gly Thr Ile Lys Thr Trp Asp Asp Pro Gln Ile Ala
155 160 165

GCG CTC AAC CCC GGC GTG AAC CTG CCC GGC ACC GCG GTA GTT CCG CTG 700
Ala Leu Asn Pro Gly Val Asn Leu Pro Gly Thr Ala Val Val Pro Leu
170 175 180

CAC CGC TCC GAC GGG TCC GGT GAC ACC TTC TTG TTC ACC CAG TAC CTG 748
His Arg Ser Asp Gly Ser Gly Asp Thr Phe Leu Phe Thr Gln Tyr Leu
185 190 195

TCC AAG CAA GAT CCC GAG GGC TGG GGC AAG TCG CCC GGC TTC GGC ACC 796
Ser Lys Gln Asp Pro Glu Gly Trp Gly Lys Ser Pro Gly Phe Gly Thr
200 205 210 215

ACC GTC GAC TTC CCG GCG GTG CCG GGT GCG CTG GGT GAG AAC GGC AAC 844
Thr Val Asp Phe Pro Ala Val Pro Gly Ala Leu Gly Glu Asn Gly Asn
220 225 230

GGC GGC ATG GTG ACC GGT TGC GCC GAG ACA CCG GGC TGC GTG GCC TAT 892
Gly Gly Met Val Thr Gly Cys Ala Glu Thr Pro Gly Cys Val Ala Tyr
235 240 245

ATC GGC ATC AGC TTC CTC GAC CAG GCC AGT CAA CGG GGA CTC GGC GAG 940
Ile Gly Ile Ser Phe Leu Asp Gln Ala Ser Gln Arg Gly Leu Gly Glu
250 255 260

GCC CAA CTA GGC AAT AGC TCT GGC AAT TTC TTG TTG CCC GAC GCG CAA 988
Ala Gln Leu Gly Asn Ser Ser Gly Asn Phe Leu Leu Pro Asp Ala Gln
265 270 275

AGC ATT CAG GCC GCG GCG GCT GGC TTC GCA TCG AAA ACC CCG GCG AAC 1036
Ser Ile Gln Ala Ala Ala Ala Gly Phe Ala Ser Lys Thr Pro Ala Asn
280 285 290 295

CAG GCG ATT TCG ATG ATC GAC GGG CCC GCC CCG GAC GGC TAC CCG ATC 1084
Gln Ala Ile Ser Met Ile Asp Gly Pro Ala Pro Asp Gly Tyr Pro Ile
300 305 310

ATC AAC TAC GAG TAC GCC ATC GTC AAC AAC CGG CAA AAG GAC GCC GCC 1132
Ile Asn Tyr Glu Tyr Ala Ile Val Asn Asn Arg Gln Lys Asp Ala Ala
315 320 325

ACC GCG CAG ACC TTG CAG GCA TTT CTG CAC TGG GCG ATC ACC GAC GGC 1180
Thr Ala Gln Thr Leu Gln Ala Phe Leu His Trp Ala Ile Thr Asp Gly
330 335 340

AAC AAG GCC TCG TTC CTC GAC CAG GTT CAT TTC CAG CCG CTG CCG CCC 1228
Asn Lys Ala Ser Phe Leu Asp Gln Val His Phe Gln Pro Leu Pro Pro
345 350 355

GCG GTG GTG AAG TTG TCT GAC GCG TTG ATC GCG ACG ATT TCC AGC 1273
Ala Val Val Lys Leu Ser Asp Ala Leu Ile Ala Thr Ile Ser Ser
360 365 370

TAGCCTCGTT GACCACCACG CGACAGCAAC CTCCGTCGGG CCATCGGGCT GCTTTGCGGA 1333

GCATGCTGGC CCGTGCCGGT GAAGTCGGCC GCGCTGGCCC GGCCATCCGG TGGTTGGGTG 1393

GGATAGGTGC GGTGATCCCG CTGCTTGCGC TGGTCTTGGT GCTGGTGGTG CTGGTCATCG 1453

AGGCGATGGG TGCGATCAGG CTCAACGGGT TGCATTTCTT CACCGCCACC GAATGGAATC 1513

CAGGCAACAC CTACGGCGAA ACCGTTGTCA CCGACGCGTC GCCCATCCGG TCGGCGCCTA 1573

CTACGGGGCG TTGCCGCTGA TCGTCGGGAC GCTGGCGACC TCGGCAATCG CCCTGATCAT 1633

CGCGGTGCCG GTCTCTGTAG GAGCGGCGCT GGTGATCGTG GAACGGCTGC CGAAACGGTT 1693

GGCCGAGGCT GTGGGAATAG TCCTGGAATT GCTCGCCGGA ATCCCCAGCG TGGTCGTCGG 1753

TTTGTGGGGG GCAATGACGT TCGGGCCGTT CATCGCTCAT CACATCGCTC CGGTGATCGC 1813

TCACAACGCT CCCGATGTGC CGGTGCTGAA CTACTTGCGC GGCGACCCGG GCAACGGGGA 1873

GGGCATGTTG GTGTCCGGTC TGGTGTTGGC GGTGATGGTC GTTCCCATTA TCGCCACCAC 1933

CACTCATGAC CTGTTCCGGC AGGTGCCGGT GTTGCCCCGG GAGGGCGCGA TCGGGAATTC 1993



(2) INFORMATION FOR SEQ ID NO: 148: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 374 amino acids 

(B) TYPE: amino acid 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 148: 

Val Lys Ile Arg Leu His Thr Leu Leu Ala Val Leu Thr Ala Ala Pro
1 5 10 15

Leu Leu Leu Ala Ala Ala Gly Cys Gly Ser Lys Pro Pro Ser Gly Ser
20 25 30

Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro Ala Ser
35 40 45

Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr Pro Leu
50 55 60

Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn Val Thr
65 70 75 80

Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln Ala Ala
85 90 95

Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser Glu Gly
100 105 110

Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala Ile Ser
115 120 125

Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His Leu Lys
130 135 140

Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile Lys Thr
145 150 155 160

Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn Leu Pro
165 170 175

Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly Asp Thr
180 185 190

Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly Trp Gly
195 200 205

Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val Pro Gly
210 215 220

Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys Ala Glu
225 230 235 240

Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp Gln Ala
245 250 255

Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser Gly Asn
260 265 270

Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala Gly Phe
275 280 285

Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp Gly Pro
290 295 300

Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile Val Asn
305 310 315 320

Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala Phe Leu
325 330 335

His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp Gln Val
340 345 350

His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp Ala Leu
355 360 365

Ile Ala Thr Ile Ser Ser
370

(2) INFORMATION FOR SEQ ID NO: 149:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1993 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 149:

TGTTCTTCGA CGGCAGGCTG GTGGAGGAAG GGCCCACCGA ACAGCTGTTC TCCTCGCCGA 60

AGCATGCGGA AACCGCCCGA TACGTCGCCG GACTGTCGGG GGACGTCAAG GACGCCAAGC 120

GCGGAAATTG AAGAGCACAG AAAGGTATGG CGTGAAAATT CGTTTGCATA CGCTGTTGGC 180

CGTGTTGACC GCTGCGCCGC TGCTGCTAGC AGCGGCGGGC TGTGGCTCGA AACCACCGAG 240

CGGTTCGCCT GAAACGGGCG CCGGCGCCGG TACTGTCGCG ACTACCCCCG CGTCGTCGCC 300

GGTGACGTTG GCGGAGACCG GTAGCACGCT GCTCTACCCG CTGTTCAACC TGTGGGGTCC 360

GGCCTTTCAC GAGAGGTATC CGAACGTCAC GATCACCGCT CAGGGCACCG GTTCTGGTGC 420

CGGGATCGCG CAGGCCGCCG CCGGGACGGT CAACATTGGG GCCTCCGACG CCTATCTGTC 480

GGAAGGTGAT ATGGCCGCGC ACAAGGGGCT GATGAACATC GCGCTAGCCA TCTCCGCTCA 540

GCAGGTCAAC TACAACCTGC CCGGAGTGAG CGAGCACCTC AAGCTGAACG GAAAAGTCCT 600

GGCGGCCATG TACCAGGGCA CCATCAAAAC [illegible] [illegible] [illegible] 660

CCCCGGCGTG AACCTGCCCG GCACCGCGGT [illegible] [illegible] [illegible] 720

TGACACCTTC TTGTTCACCC AGTACCTGTC [illegible] [illegible] [illegible] 780

GCCCGGCTTC GGCACCACCG TCGACTTCCC GGCGGTGGGG [illegible] [illegible] 840

CAACGGCGGC ATGGTGACCG GTTGCGCCGA GACACCGGGC [illegible] [illegible] 900

CAGCTTCCTC GACCAGGCCA GTCAACGGGG ACTCGGCGAG [illegible] [illegible] 960

TGGCAATTTC TTGTTGCCCG ACGCGCAAAG CATTCAGGCC [illegible] [illegible] 1020

GAAAACCCCG GCGAACCAGG CGATTTCGAT GATCGACGGG CCCGCCCGGG [illegible] 1080

GATCATCAAC TACGAGTACG CCATCGTCAA CAACCGGCAA AAGGACGCCG [illegible] 1140

GACCTTGCAG GCATTTCTGC ACTGGGCGAT CACCGACGGC AACAAGGCCT [illegible] 1200

CCAGGTTCAT TTCCAGCCGC TGCCGCCCGC GGTGGTGAAG TTGTCTGACG [illegible] 1260

GACGATTTCC AGCTAGCCTC GTTGACCACC ACGCGACAGC AACCTCCGTC [illegible] 1320

GCTGCTTTGC GGAGCATGCT GGCCCGTGCC GGTGAAGTCG GCCGCGCTGG CCCGGCCATC 1380

CGGTGGTTGG GTGGGATAGG TGCGGTGATC CCGCTGCTTG CGCTGGTCTT GGTGCTGGTG 1440

GTGCTGGTCA TCGAGGCGAT GGGTGCGATC AGGCTCAACG GGTTGCATTT CTTCACCGCC 1500

ACCGAATGGA ATCCAGGCAA CACCTACGGC GAAACCGTTG TCACCGACGC GTGGCCCATC 1560

CGGTCGGCGC CTACTACGGG GCGTTGCCGC TGATCGTCGG [illegible] [illegible] 1620

TCGCCCTGAT CATCGCGGTG CCGGTCTCTG TAGGAGCGGC [illegible] [illegible] 1680

TGCCGAAACG GTTGGCCGAG [illegible] [illegible] ATTGCTCGCC GGAATCCCCA 1740

GCGTGGTCGT [illegible] [illegible] [illegible] GTTCATCGCT CATCACATCG 1800

CTCCGGTGAT CGCTCACAAC GCTCCCGATG TGCCGGTGCT GAACTACTTG CGCGGCGACC 1860

CGGGCAACGG GGAGGGCATG TTGGTGTCCG GTCTGGTGTT GGCGGTGATG GTCGTTCCCA 1920

TTATCGCCAC CACCACTCAT GACCTGTTCC GGCAGGTGCC GGTGTTGCCC CGGGAGGGCG 1980

CGATCGGGAA TTC 1993


(2) INFORMATION FOR SEQ ID NO: 150: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 374 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 150:

Met Lys Ile Arg Leu His Thr Leu Leu Ala Val Leu Thr Ala Ala Pro
1 5 10 15

Leu Leu Leu Ala Ala Ala Gly Cys Gly Ser Lys Pro Pro Ser Gly Ser
20 25 30

Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro Ala Ser
35 40 45

Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr Pro Leu
50 55 60

Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn Val Thr
65 70 75 80

Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln Ala Ala
85 90 95

Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser Glu Gly
100 105 110

Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala Ile Ser
115 120 125

Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His Leu Lys
130 135 140

Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile Lys Thr
145 150 155 160

Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn Leu Pro
165 170 175

Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly Asp Thr
180 185 190

Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly Trp Gly
195 200 205

Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val Pro Gly
210 215 220

Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys Ala Glu
225 230 235 240

Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp Gln Ala
245 250 255

Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser Gly Asn
260 265 270

Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala Gly Phe
275 280 285

Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp Gly Pro
290 295 300

Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile Val Asn
305 310 315 320

Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala Phe Leu
325 330 335

His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp Gln Val
340 345 350

His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp Ala Leu
355 360 365

Ile Ala Thr Ile Ser Ser
370

(2) INFORMATION FOR SEQ ID NO: 151: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1777 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 151: 

GGTCTTGACC ACCACCTGGG TGTCGAAGTC GGTGCCCGGA TTGAAGTCCA GGTACTCGTG 60 

GGTGGGGCGG GCGAAACAAT AGCGACAAGC ATGCGAGCAG CCGCGGTAGC CGTTGACGGT 120 

GTAGCGAAAC GGCAACGCGG CCGCGTTGGG CACCTTGTTC AGCGCTGATT TGCACAACAC 180

CTCGTGGAAG GTGATGCCGT CGAATTGTGG CGCGCGAACG CTGCGGACCA GGCCGATCCG 240

CTGCAACCCG GCAGCGCCCG TCGTCAACGG GCATCCCGTT CACCGCGACG GCTTGCCGGG 300

CCCAACGCAT ACCATTATTC GAACAACCGT TCTATACTTT GTCAACGCTG GCCGCTACCG 360

AGCGCCGCAC AGGATGTGAT ATGCCATCTC TGCCCGCACA GACAGGAGCC AGGCCTTATG 420

ACAGCATTCG GCGTCGAGCC CTACGGGCAG CCGAAGTACC TAGAAATCGC CGGGAAGCGC 480

ATGGCGTATA TCGACGAAGG CAAGGGTGAC GCCATCGTCT TTCAGCACGG CAACCCCACG 540

TCGTCTTACT TGTGGCGCAA CATCATGCCG CACTTGGAAG GGCTGGGCCG GCTGGTGGCC 600

TGCGATCTGA TCGGGATGGG CGCGTCGGAC AAGCTCAGCC CATCGGGACC CGACCGCTAT 660

AGCTATGGCG AGCAACGAGA CTTTTTGTTC GCGCTCTGGG ATGCGCTCGA CCTCGGCGAC 720

CACGTGGTAC TGGTGCTGCA CGACTGGGGC TCGGCGCTCG GCTTCGACTG GGCTAACCAG 780

CATCGCGACC GAGTGCAGGG GATCGCGTTC ATGGAAGCGA TCGTCACCCC GATGACGTGG 840

GCGGACTGGC CGCCGGCCGT GCGGGGTGTG TTCCAGGGTT TCCGATCGCC TCAAGGCGAG 900

CCAATGGCGT TGGAGCACAA CATCTTTGTC GAACGGGTGC TGCCCGGGGC GATCCTGCGA 960

CAGCTCAGCG ACGAGGAAAT GAACCACTAT CGGCGGCCAT TCGTGAACGG CGGCGAGGAC 1020

CGTCGCCCCA CGTTGTCGTG GCCACGAAAC CTTCCAATCG ACGGTGAGCC CGCCGAGGTC 1080

GTCGCGTTGG TCAACGAGTA CCGGAGCTGG CTCGAGGAAA CCGACATGCC GAAACTGTTC 1140

ATCAACGCCG AGCCCGGCGC GATCATCACC GGCCGCATCC GTGACTATGT CAGGAGCTGG 1200

CCCAACCAGA CCGAAATCAC AGTGCCCGGC GTGCATTTCG TTCAGGAGGA CAGCGATGGC 1260

GTCGTATCGT GGGCGGGCGC TCGGCAGCAT CGGCGACCTG GGAGCGCTCT CATTTCACGA 1320

GACCAAGAAT GTGATTTCCG GCGAAGGCGG CGCCCTGCTT GTCAACTCAT AAGACTTCCT 1380

GCTCCGGGCA GAGATTCTCA GGGAAAAGGG CACCAATCGC AGCCGCTTCC TTCGCAACGA 1440

GGTCGACAAA TATACGTGGC AGGACAAAGG TCTTCCTATT TGCCCAGCGA ATTAGTCGCT 1500

GCCTTTCTAT GGGCTCAGTT CGAGGAAGCC GAGCGGATCA CGCGTATCCC [illegible] 1560

TGGAACCGGT ATCATGAAAG CTTCGAATCA TTGGAACAGC GGGGGCTCCT GCGCCGTCCG 1620

ATCATCCCAC AGGGCTGCTC TCACAACGCC CACATGTACT ACGTGTTACT AGCGCCCAGC 1680

GCCGATCGGG AGGAGGTGCT GGCGCGTCTG ACGAGCGAAG GTATAGGCGC GGTCTTTCAT 1740

TACGTGCCGC TTCACGATTC GCCGGCCGGG CGTCGCT 1777


(2) INFORMATION FOR SEQ ID NO: 152: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 324 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 152:

GAGATTGAAT CGTACCGGTC TCCTTAGCGG CTCCGTCCCG TGAATGCCCA TATCACGCAC 60

GGCCATGTTC TGGCTGTCGA CCTTCGCCCC ATGCCCGGAC GTTGGTAAAC CCAGGGTTTG 120

ATCAGTAATT CCGGGGGACG GTTGCGGGAA GGCGGCCAGG ATGTGCGTGA GCCGCGGCGC 180

CGCCGTCGCC CAGGCGACCG CTGGATGCTC AGCCCCGGTG CGGCGACGTA GCCAGCGTTT 240

GGCGCGTGTC GTCCACAGTG GTACTCCGGT GACGACGCGG CGCGGTGCCT GGGTGAAGAC 300

CGTGACCGAC GCCGCCGATT CAGA 324

(2) INFORMATION FOR SEQ ID NO: 153:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1338 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 153:

GCGGTACCGC CGCGTTGCGC TGGCACGGGA CCTGTACGAC CTGAACCACT TCGCCTCGCG 60

AACGATTGAC GAACCGCTCG TGCGGCGGCT GTGGGTGCTC AAGGTGTGGG GTGATGTCGT 120

CGATGACCGG CGCGGCACCC GGCCACTACG CGTCGAAGAC GTCCTCGCCG CCCGCAGCGA 180

GCACGACTTC CAGCCCGACT CGATCGGCGT GCTGACCCGT CCTGTCGCTA TGGCTGCCTG 240

GGAAGCTCGC GTTCGGAAGC GATTTGCGTT CCTCACTGAC CTCGACGCCG ACGAGCAGCG 300

GTGGGCCGCC TGCGACGAAC GGCACCGCCG CGAAGTGGAG AACGCGCTGG CGGTGCTGCG 360

GTCCTGATCA ACCTGCCGGC GATCGTGCCG TTCCGCTGGC ACGGTTGCGG CTGGACGCGG 420

CTGAATCGAC TAGATGAGAG CAGTTGGGCA CGAATCCGGC TGTGGTGGTG AGCAAGACAC 480

GAGTACTGTC ATCACTATTG GATGCACTGG ATGACCGGCC TGATTCAGCA GGACCAATGG 540

AACTGCCCGG GGCAAAACGT CTCGGAGATG ATCGGCGTCC CCTCGGAACC CTGCGGTGCT 600

GGCGTCATTC GGACATCGGT CCGGCTCGCG GGATCGTGGT GACGCCAGCG CTGAAGGAGT 660

GGAGCGCGGC GGTGCACGCG CTGCTGGACG GCCGGCAGAC GGTGCTGCTG CGTAAGGGCG 720

GGATCGGCGA GAAGCGCTTC GAGGTGGCGG CCCACGAGTT CTTGTTGTTC CCGACGGTCG 780

CGCACAGCCA CGCCGAGCGG GTTCGCCCCG AGCACCGCGA CCTGCTGGGC CCGGCGGCCG 840

CCGACAGCAC CGACGAGTGT GTGCTACTGC GGGCCGCAGC GAAAGTTGTT GCCGCACTGC 900

CGGTTAACCG GCCAGAGGGT CTGGACGCCA TCGAGGATCT GCACATCTGG ACCGCCGAGT 960 

CGGTGCGCGC CGACCGGCTC GACTTTCGGC CCAAGCACAA ACTGGCCGTC TTGGTGGTCT 1020 

CGGCGATCCC GCTGGCCGAG CCGGTCCGGC TGGCGCGTAG GCCCGAGTAC GGCGGTTGCA 1080 

CCAGCTGGGT GCAGCTGCCG GTGACGCCGA CGTTGGCGGC GCCGGTGCAC GACGAGGCCG 1140 

CGCTGGCCGA GGTCGCCGCC CGGGTCCGCG AGGCCGTGGG TTGACTGGGC GGCATCGCTT 1200 

GGGTCTGAGC TGTACGCCCA GTCGGCGCTG CGAGTGATCT GCTGTCGGTT CGGTCCCTGC 1260 

TGGCGTCAAT TGACGGCGCG GGCAACAGCA GCATTGGCGG CGCCATCCTC CGCGCGGCCG 1320 

GCGCCCACCG CTACAACC 1338

(2) INFORMATION FOR SEQ ID NO: 154:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 321 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 154: 

CCGGCGGCAC CGGCGGCACC GGCGGTACCG GCGGCAACGG CGCTGACGCC GCTGCTGTGG 60

TGGGCTTCGG CGCGAACGGC GACCCTGGCT TCGCTGGCGG CAAAGGCGGT AACGGCGGAA 120

TAGGTGGGGC CGCGGTGACA GGCGGGGTCG CCGGCGACGG CGGCACCGGC GGCAAAGGTG 180

GCACCGGCGG TGCCGGCGGC GCCGGCAACG ACGCCGGCAG CACCGGCAAT CCCGGCGGTA 240

AGGGCGGCGA CGGCGGGATC GGCGGTGCCG GCGGGGCCGG CGGCGCGGCC GGCACCGGCA 300

ACGGCGGCCA TGCCGGCAAC C 321
(2) INFORMATION FOR SEQ ID NO: 155: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 492 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






WO 98/16645 



PCT/US97/18214 






(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 155: 

GAAGACCCGG CCCCGCCATA TCGATCGGCT CGCCGACTAC TTTCGCCGAA CGTGCACGCG 60 

GCGGCGTCGG GCTGATCATC ACCGGTGGCT ACGCGCCCAA CCGCACCGGA TGGCTGCTGC 120 

CGTTCGCCTC CGAACTCGTC ACTTCGGCGC AAGCCCGACG GCACCGCCGA ATCACCAGGG 180 

CGGTCCACGA TTCGGGTGCA AAGATCCTGC TGCAAATCCT GCACGCCGGA CGCTACGCCT 240

ACCACCCACT TGCGGTCAGC GCCTCGCCGA TCAAGGCGCC GATCACCCCG TTTCGTCCGC 300

GAGCACTATC GGCTCGCGGG GTCGAAGCGA CCATCGCGGA TTTCGCCCGC TGCGCGCAGT 360

TGGCCCGCGA TGCCGGCTAC GACGGCGTCG AAATCATGGG CAGCGAAGGG TATCTGCTCA 420

ATCAGTTCCT GGCGCCGCGC ACCAACAAGC GCACCGACTC GTGGGGCGGC ACACCGGCCA 480

ACCGTCGCCG GT 492 
(2) INFORMATION FOR SEQ ID NO: 156: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 536 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 156: 

Phe Ala Gln His Leu Val Glu Gly Asp Ala Val Glu Leu Trp Arg Ala
1 5 10 15

Asn Ala Ala Asp Gln Ala Asp Pro Leu Gln Pro Gly Ser Ala Arg Arg
20 25 30

Gln Arg Ala Ser Arg Ser Pro Arg Arg Leu Ala Gly Pro Asn Ala Tyr
35 40 45

His Tyr Ser Asn Asn Arg Ser Ile Leu Cys Gln Arg Trp Pro Leu Pro
50 55 60

Ser Ala Ala Gln Asp Val Ile Cys His Leu Cys Pro His Arg Gln Glu
65 70 75 80

Pro Gly Leu Met Thr Ala Phe Gly Val Glu Pro Tyr Gly Gln Pro Lys
85 90 95

Tyr Leu Glu Ile Ala Gly Lys Arg Met Ala Tyr Ile Asp Glu Gly Lys
100 105 110

Gly Asp Ala Ile Val Phe Gln His Gly Asn Pro Thr Ser Ser Tyr Leu
115 120 125

Trp Arg Asn Ile Met Pro His Leu Glu Gly Leu Gly Arg Leu Val Ala
130 135 140

Cys Asp Leu Ile Gly Met Gly Ala Ser Asp Lys Leu Ser Pro Ser Gly
145 150 155 160

Pro Asp Arg Tyr Ser Tyr Gly Glu Gln Arg Asp Phe Leu Phe Ala Leu
165 170 175

Trp Asp Ala Leu Asp Leu Gly Asp His Val Val Leu Val Leu His Asp
180 185 190

Trp Gly Ser Ala Leu Gly Phe Asp Trp Ala Asn Gln His Arg Asp Arg
195 200 205

Val Gln Gly Ile Ala Phe Met Glu Ala Ile Val Thr Pro Met Thr Trp
210 215 220

Ala Asp Trp Pro Pro Ala Val Arg Gly Val Phe Gln Gly Phe Arg Ser
225 230 235 240

Pro Gln Gly Glu Pro Met Ala Leu Glu His Asn Ile Phe Val Glu Arg
245 250 255

Val Leu Pro Gly Ala Ile Leu Arg Gln Leu Ser Asp Glu Glu Met Asn
260 265 270

His Tyr Arg Arg Pro Phe Val Asn Gly Gly Glu Asp Arg Arg Pro Thr
275 280 285

Leu Ser Trp Pro Arg Asn Leu Pro Ile Asp Gly Glu Pro Ala Glu Val
290 295 300

Val Ala Leu Val Asn Glu Tyr Arg Ser Trp Leu Glu Glu Thr Asp Met
305 310 315 320

Pro Lys Leu Phe Ile Asn Ala Glu Pro Gly Ala Ile Ile Thr Gly Arg
325 330 335

Ile Arg Asp Tyr Val Arg Ser Trp Pro Asn Gln Thr Glu Ile Thr Val
340 345 350

Pro Gly Val His Phe Val Gln Glu Asp Ser Asp Gly Val Val Ser Trp
355 360 365

Ala Gly Ala Arg Gln His Arg Arg Pro Gly Ser Ala Leu Ile Ser Arg
370 375 380

Asp Gln Glu Cys Asp Phe Arg Arg Arg Arg Arg Pro Ala Cys Gln Leu
385 390 395 400

Ile Arg Leu Pro Ala Pro Gly Arg Asp Ser Gln Gly Lys Gly His Gln
405 410 415

Ser Gln Pro Leu Pro Ser Gln Arg Gly Arg Gln Ile Tyr Val Ala Gly
420 425 430

Gln Arg Ser Ser Tyr Leu Pro Ser Glu Leu Val Ala Ala Phe Leu Trp
435 440 445

Ala Gln Phe Glu Glu Ala Glu Arg Ile Thr Arg Ile Arg Leu Asp Leu
450 455 460

Trp Asn Arg Tyr His Glu Ser Phe Glu Ser Leu Glu Gln Arg Gly Leu
465 470 475 480

Leu Arg Arg Pro Ile Ile Pro Gln Gly Cys Ser His Asn Ala His Met
485 490 495

Tyr Tyr Val Leu Leu Ala Pro Ser Ala Asp Arg Glu Glu Val Leu Ala
500 505 510

Arg Leu Thr Ser Glu Gly Ile Gly Ala Val Phe His Tyr Val Pro Leu
515 520 525

His Asp Ser Pro Ala Gly Arg Arg
530 535

(2) INFORMATION FOR SEQ ID NO: 157:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 284 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 157:

Asn Glu Ser Ala Pro Arg Ser Pro Met Leu Pro Ser Ala Arg Pro Arg
1 5 10 15

Tyr Asp Ala Ile Ala Val Leu Leu Asn Glu Met His Ala Gly His Cys
20 25 30

Asp Phe Gly Leu Val Gly Pro Ala Pro Asp Ile Val Thr Asp Ala Ala
35 40 45

Gly Asp Asp Arg Ala Gly Leu Gly Val Asp Glu Gln Phe Arg His Val
50 55 60

Gly Phe Leu Glu Pro Ala Pro Val Leu Val Asp Gln Arg Asp Asp Leu
65 70 75 80

Gly Gly Leu Thr Val Asp Trp Lys Val Ser Trp Pro Arg Gln Arg Gly
85 90 95

Ala Thr Val Leu Ala Ala Val His Glu Trp Pro Pro Ile Val Val His
100 105 110

Phe Leu Val Ala Glu Leu Ser Gln Asp Arg Pro Gly Gln His Pro Phe
115 120 125

Asp Lys Asp Val Val Leu Gln Arg His Trp Leu Ala Leu Arg Arg Ser
130 135 140

Glu Thr Leu Glu His Thr Pro His Gly Arg Arg Pro Val Arg Pro Arg
145 150 155 160

His Arg Gly Asp Asp Arg Phe His Glu Arg Asp Pro Leu His Ser Val
165 170 175

Ala Met Leu Val Ser Pro Val Glu Ala Glu Arg Arg Ala Pro Val Val
180 185 190

Gln His Gln Tyr His Val Val Ala Glu Val Glu Arg Ile Pro Glu Arg
195 200 205

Glu Gln Lys Val Ser Leu Leu Ala Ile Ala Ile Ala Val Gly Ser Arg
210 215 220

Trp Ala Glu Leu Val Arg Arg Ala His Pro Asp Gln Ile Ala Gly His
225 230 235 240

Gln Pro Ala Gln Pro Phe Gln Val Arg His Asp Val Ala Pro Gln Val
245 250 255

Arg Arg Arg Gly Val Ala Val Leu Lys Asp Asp Gly Val Thr Leu Ala
260 265 270

Phe Val Asp Ile Arg His Ala Leu Pro Gly Asp Phe
275 280

(2) INFORMATION FOR SEQ ID NO: 158: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 264 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 158:

ATGAACATGT CGTCGGTGGT GGGTCGCAAG GCCTTTGCGC GATTCGCCGG CTACTCCTCC 60

GCCATGCACG CGATCGCCGG TTTCTCCGAT GCGTTGCGCC AAGAGCTGCG GGGTAGCGGA 120 

ATCGCCGTCT CGGTGATCCA CCCGGCGCTG ACCCAGACAC CGCTGTTGGC CAACGTCGAC 180 

CCCGCCGACA TGCCGCCGCC GTTTCGCAGC CTCACGCCCA TTCCCGTTCA CTGGGTCGCG 240 

GCAGCGGTGC TTGACGGTGT GGCG 264 
(2) INFORMATION FOR SEQ ID NO: 159: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1171 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 159: 

TAGTCGGCGA CGATGACGTC GCGGTCCAGG CCGACCGCTT CAAGCACCAG CGCGACCACG 60

AAGCCGGTGC GATCCTTACC CGCGAAGCAG TGGGTGAGCA CCGGGCGTCC GGCGGCAAGC 120

AGTGTGACGA CACGATGTAG CGCGCGCTGT GCTCCATTGC GCGTTGGGAA TTGGCGATAC 180

TCGTCGGTCA TGTAGCGGGT GGCCGCGTCA TTTATCGACT GGCTGGATTC GCCGGACTCG 240

CCGTTGGACC CGTCATTGGT TAGCAGCCTC TTGAATGCGG TTTCGTGCGG CGCTGAGTCG 300

TCGGCGTCAT CATCGGCGAG GTCGGGGAAC GGCAGCAGGT GGACGTCGAT GCCGTCCGGA 360

ACCCGTCCTG GACCGCGGCG GGCAACCTCC CGGGACGACC GCAGGTCGGC AACGTCGGTG 420

ATCCCCAGCC GGCGCAGCGT TGCCCCTCGT GCCGAATTCG GCACGAGGCT GGCGAGCCAC 480

CGGGCATCAC CAAGCAACGC TTGCCCAGTA CGGATCGTCA CTTCCGCATC CGGCAGACCA 540

ATCTCCTCGC CGCCCATCGT CAGATCCCGC TCGTGCGTTG ACAAGAACGG CCGCAGATGT 600

GCCAGCGGGT ATCGGAGATT GAACCGCGCA CGCAGTTCTT CAATCGCTGC GCGCTGCCGC 660

ACTATTGGCA CTTTCCGGCG GTCGCGGTAT TCAGCAAGCA TGCGAGTCTC GACGAACTCG 720

CCCCACGTAA CCCACGGCGT AGCTCCCGGC GTGACGCGGA GGATCGGCGG GTGATCTTTG 780

CCGCCACGCT CGTAGCCGTT GATCCACCGC TTCGCGGTGC CGGCGGGGAG GCCGATCAGC 840

TTATCGACCT CGGCGTATGC CGACGGCAAG CTGGGCGCGT TCGTCGAGGT CAAGAACTCC 900

ACCATCGGCA CCGGCACCAA GGTGCCGCAC CTGACCTACG TCGGCGACGC CGACATCGGC 960

GAGTACAGCA ACATCGGCGC CTCCAGCGTG TTCGTCAACT ACGACGGTAC GTCCAAACGG 1020

CGCACCACCG TCGGTTCGCA CGTACGGACC GGGTCCGACA CCATGTTCGT GGCCCCAGTA 1080

ACCATCGGCG ACGGCGCGTA TACCGGGGCC GGCACAGTGG TGCGGGAGGA TGTCCCGCCG 1140

GGGGCGCTGG CAGTGTCGGC GGGTCCGCAA C 1171
(2) INFORMATION FOR SEQ ID NO: 160:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 227 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 160: 
GCAAAGGCGG CACCGGCGGG GCCGGCATGA ACAGCCTCGA CCCGCTGCTA GCCGCCCAAG 60 
ACGGCGGCCA AGGCGGCACC GGCGGCACCG GCGGCAACGC CGGCGCCGGC GGCACCAGCT 120 
TCACCCAAGG CGCCGACGGC AACGCCGGCA ACGGCGGTGA CGGCGGGGTC GGCGGCAACG 180 
GCGGAAACGG CGGAAACGGC GCAGACAACA CCACCACCGC CGCCGCC 227 
(2) INFORMATION FOR SEQ ID NO: 161: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 304 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 161: 

CCTCGCCACC ATGGGCGGGC AGGGCGGTAG CGGTGGCGCC GGCTCTACCC CAGGCGCCAA 60 

GGGCGCCCAC GGCTTCACTC CAACCAGCGG CGGCGACGGC GGCGACGGCG GCAACGGCGG 120

CAACTCCCAA GTGGTCGGCG GCAACGGCGG CGACGGCGGC AATGGCGGCA ACGGCGGCAG 180

CGCCGGCACG GGCGGCAACG GCGGCCGCGG CGGCGACGGC GCGTTTGGTG GCATGAGTGC 240

CAACGCCACC AACCCTGGTG AAAACGGGCC AAACGGTAAC CCCGGCGGCA ACGGTGGCGC 300

CGGC 304
(2) INFORMATION FOR SEQ ID NO: 162: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1439 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 162: 
GTGGGACGCT GCCGAGGCTG TATAACAAGG ACAACATCGA CCAGCGCCGG CTCGGTGAGC 60

TGATCGACCT ATTTAACAGT GCGCGCTTCA GCCGGCAGGG CGAGCACCGC GCCCGGGATC 120

TGATGGGTGA GGTCTACGAA TACTTCCTCG GCAATTTCGC TCGCGCGGAA GGGAAGCGGG 180

GTGGCGAGTT CTTTACCCCG CCCAGCGTGG TCAAGGTGAT CGTGGAGGTG CTGGAGCCGT 240

CGAGTGGGCG GGTGTATGAC CCGTGCTGCG GTTCCGGAGG CATGTTTGTG CAGACCGAGA 300

AGTTCATCTA CGAACACGAC GGCGATCCGA AGGATGTCTC GATCTATGGC CAGGAAAGCA 360

TTGAGGAGAC CTGGCGGATG GCGAAGATGA ACCTCGCCAT CCACGGCATC GACAACAAGG 420

GGCTCGGCGC CCGATGGAGT GATACCTTCG CCCGCGACCA GCACCCGGAC GTGCAGATGG 480

ACTACGTGAT GGCCAATCCG CCGTTCAACA TCAAAGACTG GGCCCGCAAC GAGGAAGACC 540

CACGCTGGCG CTTCGGTGTT CCGCCCGCCA ATAACGCCAA CTACGCATGG ATTCAGCACA 600

TCCTGTACAA CTTGGCGCCG GGAGGTCGGG CGGGCGTGGT GATGGCCAAC GGGTCGATGT 660

CGTCGAACTC CAACGGCAAG GGGGATATTC GCGCGCAAAT CGTGGAGGCG GATTTGGTTT 720

CCTGCATGGT CGCGTTACCC ACCCAGCTGT TCCGCAGCAC CGGAATCCCG GTGTGCCTGT 780

GGTTTTTCGC CAAAAACAAG GCGGCAGGTA AGCAAGGGTC TATCAACCGG TGCGGGCAGG 840

TGCTGTTCAT CGACGCTCGT GAACTGGGCG ACCTAGTGGA CCGGGCCGAG CGGGCGCTGA 900

CCAACGAGGA GATCGTCCGC ATCGGGGATA CCTTCCACGC GAGCACGACC ACCGGCAACG 960

CCGGCTCCGG TGGTGCCGGC GGTAATGGGG GCACTGGCCT CAACGGCGCG GGCGGTGCTG 1020

GCGGGGCCGG CGGCAACGCG GGTGTCGCCG GCGTGTCCTT CGGCAACGCT GTGGGCGGCG 1080

ACGGCGGCAA CGGCGGCAAC GGCGGCCACG GCGGCGACGG CACGACGGGC GGCGCCGGCG 1140

GCAAGGGCGG CAACGGCAGC AGCGGTGCCG CCAGCGGCTC AGGCGTCGTC AACGTCACCG 1200

CCGGCCACGG CGGCAACGGC GGCAATGGCG GCAACGGCGG CAACGGCTCC GCGGGCGCCG 1260

GCGGCCAGGG CGGTGCCGGC GGCAGCGCCG GCAACGGCGG CCACGGCGGC GGTGCCACCG 1320

GCGGCGCCAG CGGCAACGGC GGCAACGGCA CCAGCGGTGC CGCCAGCGGC TCAGGCGTCA 1380

TCAACGTCAC CGCCGGCCAC GGCGGCAACG GCGGCAATGG CGGCAACGGC GGCAACGGC 1439
(2) INFORMATION FOR SEQ ID NO: 163:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 329 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 163:

GGGCCGGCGG GGCCGGATTT TCTCGTGCCT TGATTGTCGC TGGGGATAAC GGCGGTGATG 60

GTGGTAACGG CGGGATGGGC GGGGCTGGCG GGGCTGGCGG CCCCGGCGGG GCCGGCGGCC 120

TGATCAGCCT GCTGGGCGGC CAAGGCGCCG GCGGGGCCGG CGGGACCGGC GGGGCCGGCG 180

GTGTTGGCGG TGACGGCGGG GCCGGCGGCC CCGGCAACCA GGCCTTCAAC GCAGGTGCCG 240

GCGGGGCCGG CGGCCTGATC AGCCTGCTGG GCGGCCAAGG CGCCGGCGGG GCCGGCGGGA 300

CCGGCGGGGC CGGCGGTGTT GGCGGTGAC 329
(2) INFORMATION FOR SEQ ID NO: 164:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 80 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 164: 
GCAACGGTGG CAACGGCGGC ACCAGCACGA CCGTGGGGAT GGCCGGAGGT AACTGTGGTG 60 
CCGCCGGGCT GATCGGCAAC 80
(2) INFORMATION FOR SEQ ID NO: 165:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 392 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 165:

GGGCTGTGTC GCACTCACAC CGCCGCATTC GGCGACGTTG GCCGCCCAAT ATCCAGCTCA 60

AGGCCTACTA CTTACCGTCG GAGGACCGCC GCATCAAGGT GCGGGTCAGC GCCCAAGGAA 120

TCAAGGTCAT CGACCGCGAC GGGCATCGAG GCCGTCGTCG CGCGGCTCGG GCAGGATCCG 180

CCCCGGCGCA CTTCGCGCGC CAAGCGGGCT CATCGCTCCG AACGGCGGCG ATCCTGTGAG 240

CACAACTGAT GGCGCGCAAC GAGATTCGTC CAATTGTCAA GCCGTGTTCG ACCGCAGGGA 300

CCGGTTATAC GTATGTCAAC CTATGTCACT CGCAAGAACC GGCATAACGA TCCCGTGATC 360

CGCCGACAGC CCACGAGTGC AAGACCGTTA CA 392
(2) INFORMATION FOR SEQ ID NO: 166: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 535 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 166: 

ACCGGCGCCA CCGGCGGCAC CGGGTTCGCC GGTGGCGCCG GCGGGGCCGG CGGGCAGGGC 60 

GGTATCAGCG GTGCCGGCGG CACCAACGGC TCTGGTGGCG CTGGCGGCAC CGGCGGACAA 120 

GGCGGCGCCG GGGGCGCTGG CGGGGCCGGC GCCGATAACC CCACCGGCAT CGGCGGCGCC 180 

GGCGGCACCG GCGGCACCGG CGGAGCGGCC GGAGCCGGCG GGGCCGGTGG CGCCATCGGT 240

ACCGGCGGCA CCGGCGGCGC GGTGGGCAGC GTCGGTAACG CCGGGATCGG CGGTACCGGC 300

GGTACGGGTG GTGTCGGTGG TGCTGGTGGT GCAGGTGCGG CTGCGGCCGC TGGCAGCAGC 360

GCTACCGGTG GCGCCGGGTT CGCCGGCGGC GCCGGCGGAG AAGGCGGACC GGGCGGCAAC 420

AGCGGTGTGG GCGGCACCAA CGGCTCCGGC GGCGCCGGCG GTGCAGGCGG CAAGGGCGGC 480

ACCGGAGGTG CCGGCGGGTC CGGCGCGGAC AACCCCACCG GTGCTGGTTT CGCCG
(2) INFORMATION FOR SEQ ID NO: 167: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 690 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 167:

CCGACGTCGC CGGGGCGATA CGGGGGTCAC CGACTACTAC ATCATCCGCA CCGAGAATCG 60

GCCGCTGCTG CAACCGCTGC GGGCGGTGCC GGTCATCGGA GATCCGCTGG CCGACCTGAT 120

CCAGCCGAAC CTGAAGGTGA TCGTCAACCT GGGCTACGGC GACCCGAACT ACGGCTACTC 180

GACGAGCTAC GCCGATGTGC GAACGCCGTT CGGGCTGTGG CCGAACGTGC CGCCTCAGGT 240

CATCGCCGAT GCCCTGGCCG CCGGAACACA AGAAGGCATC CTTGACTTCA CGGCCGACCT 300

GCAGGCGCTG TCCGCGCAAC CGCTCACGCT CCCGCAGATC CAGCTGCCGC AACCCGCCGA 360

TCTGGTGGCC GCGGTGGCCG CCGCACCGAC GCCGGCCGAG GTGGTGAACA CGCTCGCCAG 420

GATCATCTCA ACCAACTACG CCGTCCTGCT GCCCACCGTG GACATCGCCC TCGCCTGGTC 480

ACCACCCTGC CGCTGTACAC CACCCAACTG TTCGTCAGGC AACTCGCTGC GGGCAATCTG 540

ATCAACGCGA TCGGCTATCC CCTGGCGGCC ACCGTAGGTT TAGGCACGAT CGATAGCGGG 600

CGGCGTGGAA TTGCTCACCC TCCTCGCGGC GGCCTCGGAC ACCGTTCGAA ACATCGAGGG 660

CCTCGTCACC TAACGGATTC CCGACGGCAT 690
(2) INFORMATION FOR SEQ ID NO: 168: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 407 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 168:

ACGGTGACGG CGGTACTGGC GGCGGCCACG GCGGCAACGG CGGGAATCCC GGGTGGCTCT 60

TGGGCACAGC CGGGGGTGGC GGCAACGGTG GCGCCGGCAG CACCGGTACT GCAGGTGGCG 120

GCTCTGGGGG CACCGGCGGC GACGGCGGGA CCGGCGGGCG TGGCGGCCTG TTAATGGGCG 180

CCGGCGCCGG CGGGCACGGT GGCACTGGCG GCGCGGGCGG TGCCGGTGTC GACGGTGGCG 240

GCGCCGGCGG GGCCGGCGGG GCCGGCGGCA ACGGCGGCGC CGGGGGTCAA GCCGCCCTGC 300

TGTTCGGGCG CGGCGGCACC GGCGGAGCCG GCGGCTACGG CGGCGATGGC GGTGGCGGCG 360

GTGACGGCTT CGACGGCACG ATGGCCGGCC TGGGTGGTAC CGGTGGC 407
(2) INFORMATION FOR SEQ ID NO: 169: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 468 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 169:

GATCGGTCAG CGCATCGCCC TCGGCGGCAA GCGATTCCGC GGTCTCACCG AAGAACATCG 60

TGCACGCGGC GGCGCGGACC AGCCCGCTGC GCTGCGGCGC GTCGAACGCC TCCAGCAGGC 120

ACAGCCAGTC CTTGGCGGCC TGCGAGGCGA ACACGTCGGT GTCACCGGTG TAGATCGCCG 180

GGATGCCCGC CTCCGCCAAC GCATTCCGGC ACGCCCGCGC GTCTTTGTGA TGCTCGACGA 240

TCACCGCGAT GTCTGCGGCC ACCACGGGCC GCCCGGCGAA GGTGGCCCCG CTGGCCAGTA 300

GCGCCGCGAC GTCGGCGGCC AGGTCGTCGG GGATGTGCCG GCGCAGCGCT CCGGCGCGAC 360

GCCCGAAAAA CGACCCCTCA CCCAGCTGGG TCCCGCTGGC ATATCCCTTG CCGTCCTGGG 420

CGATATTGGA CGCGCATGCC CCGACCGCGT ACAGGCCGGC CACCACCG 468
(2) INFORMATION FOR SEQ ID NO: 170:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 219 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 170:
GGTGGTAACG GCGGCCAGGG TGGCATCGGC GGCGCCGGCG AGAGAGGCGC CGACGGCGCC 60 
GGCCCCAATG CTAACGGCGC AAACGGCGAG AACGGCGGTA GCGGTGGTAA CGGTGGCGAC 120 
GGCGGCGCCG GCGGCAATGG CGGCGCGGGC GGCAACGCGC AGGCGGCCGG GTACACCGAC 180 
GGCGCCACGG GCACCGGCGG CGACGGCGGC AACGGCGGC 
(2) INFORMATION FOR SEQ ID NO: 171: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 494 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 171: 

TAGCTCCGGC GAGGGCGGCA AGGGCGGCGA CGGTGGCCAC GGCGGTGACG GCGTCGGCGG 60 

CAACAGTTCC GTCACCCAAG GCGGCAGCGG CGGTGGCGGC GGCGCCGGCG GCGCCGGCGG 120 

CAGCGGCTTT TTCGGCGGCA AGGGCGGCTT CGGCGGCGAC GGCGGTCAGG GCGGCCCCAA 180 

CGGCGGCGGT ACCGTCGGCA CCGTGGCCGG TGGCGGCGGC AACGGCGGTG TCGGCGGCCG 240

GGGCGGCGAC GGCGTCTTTG CCGGTGCCGG CGGCCAGGGC GGCCTCGGTG GGCAGGGCGG 300

CAATGGCGGC GGCTCCACCG GCGGCAACGG CGGCCTTGGC GGCGCCGGCG GTGGCGGAGG 360

CAACGCCCCG GCTCGTGCCG AATCCGGGCT GACCATGGAC AGCGCGGCCA AGTTCGCTGC 420

CATCGCATCA GGCGCGTACT GCCCCGAACA CCTGGAACAT CACCCGAGTT AGCGGGGCGC 480

ATTTCCTGAT CACC 494 
(2) INFORMATION FOR SEQ ID NO: 172: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 220 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 172:

GGGCCGGTGG TGCCGCGGGC CAGCTCTTCA GCGCCGGAGG CGCGGCGGGT GCCGTTGGGG 60 

TTGGCGGCAC CGGCGGCCAG GGTGGGGCTG GCGGTGCCGG AGCGGCCGGC GCCGACGCCC 120 

CCGCCAGCAC AGGTCTAACC GGTGGTACCG GGTTCGCTGG CGGGGCCGGC GGCGTCGGCG 180 

GCCAGAGCGG CAACGCCATT GCCGGCGGCA TCAACGGCTC 220 
(2) INFORMATION FOR SEQ ID NO: 173: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 388 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 173:

ATGGCGGCAA CGGGGGCCCC GGCGGTGCTG GCGGGGCCGG CGACTACAAT TTCCAACGGC 60

GGGCAGGGTG GTGCCGGCGG CCAAGGCGGC CAAGGCGGCC TGGGCGGGGC AAGCACCACC 120

TGATCGGCCT AGCCGCACCC GGGAAAGCCG ATCCAACAGG CGACGATGCC GCCTTCCTTG 180

CCGCGTTGGA CCAGGCCGGC ATCACCTACG CTGACCCAGG CCACGCCATA ACGGCCGCCA 240

AGGCGATGTG TGGGCTGTGT GCTAACGGCG TAACAGGTCT ACAGCTGGTC GCGGACCTGC 300

GGGACTACAA TCCCGGGCTG ACCATGGACA GCGCGGCCAA GTTCGCTGCC ATCGCATCAG 360

GCGCGTACTG CCCCGAACAC CTGGAACA 388
(2) INFORMATION FOR SEQ ID NO: 174:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 400 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 174: 
GCAAAGGCGG CACCGGCGGG GCCGGCATGA ACAGCCTCGA CCCGCTGCTA GCCGCCCAAG 
ACGGCGGCCA AGGCGGCACC GGCGGCACCG GCGGCAACGC CGGCGCCGGC GGCACCAGCT
TCACCCAAGG CGCCGACGGC AACGCCGGCA ACGGCGGTGA CGGCGGGGTC GGCGGCAACG
GCGGAAACGG CGGAAACGGC GCAGACAACA CCACCACCGC CGCCGCCGGC ACCACAGGCG 
GCGACGGCGG GGCCGGCGGG GCCGGCGGAA CCGGCGGAAC CGGCGGAGCC GCCGGCACCG 
GCACCGGCGG CCAACAAGGC AACGGCGGCA ACGGCGGCAC CGGCGGCAAA GGCGGCACCG 
GCGGCGACGG TGCACTCTCA GGCAGCACCG GTGGTGCCGG 
(2) INFORMATION FOR SEQ ID NO: 175: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 538 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 175:

GGCAACGGCG GCAACGGCGG CATCGCCGGC ATTGGGCGGC AACGGCGTTC CGGGACGGGC 60

AGCGGCAACG GCGGCCAACG GCGGCAGCGG CGGCAACGGC GGCAACGCCG GCATGGGCGG 120

CAACAGCGGC ACCGGCAGCG GCGACGGCGG TGCCGGCGGG AACGGCGGCG CGGCGGGCAC 180

GGGCGGCACC GGCGGCGACG GCGGCCTCAC CGGTACTGGC GGCACCGGCG GCAGCGGTGG 240

CACCGGCGGT GACGGCGGTA ACGGCGGCAA CGGAGCAGAT AACACCGCAA ACATGACTGC 300

GCAGGCGGGC GGTGACGGTG GCAACGGCGG CGACGGTGGC TTCGGCGGCG GGGCCGGGGC 360

CGGCGGCGGT GGCTTGACCG CTGGCGCCAA CGGCACCGGC GGGCAAGGCG GCGCCGGCGG 420

CGATGGCGGC AACGGGGCCA TCGGCGGCCA CGGCCCACTC ACTGACGACC CCGGCGGCAA 480

CGGGGGCACC GGCGGCAACG GCGGCACCGG CGGCACCGGC GGCGCGGGCA TCGGCAGC 538
(2) INFORMATION FOR SEQ ID NO: 176:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 239 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 176:
GGGCCGGTGG TGCCGCGGGC CAGCTCTTCA GCGCCGGAGG CGCGGCGGGT GCCGTTGGGG 60 
TTGGCGGCAC CGGCGGCCAG GGTGGGGCTG GCGGTGCCGG AGCGGCCGGC GCCGACGCCC 120 
CCGCCAGCAC AGGTCTAACC GGTGGTACCG GGTTCGCTGG CGGGGCCGGC GGCGTCGGCG 180 
GCCACGGCGG CAACGCCATT GCCGGCGGCA TCAACGGCTC CGGTGGTGCC GGCGGCACC 239 
(2) INFORMATION FOR SEQ ID NO: 177: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 985 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 177:

AGCAGCGCTA CCGGTGGCGC CGGGTTCGCC GGCGGCGCCG GCGGAGAAGG CGGAGCGGGC 60

GGCAACAGCG GTGTGGGCGG CACCAACGGC TCCGGCGGCG CCGGCGGTGC AGGCGGCAAG 120

GGCGGCACCG GAGGTGCCGG CGGGTCCGGC GCGGACAACC CCACCGGTGC TGGTTTCGCC 180

GGTGGCGCCG GCGGCACAGG TGGCGCGGCC GGCGCCGGCG GGGCCGGCGG GGCGACCGGT 240

ACCGGCGGCA CCGGCGGCGT TGTCGGCGCC ACCGGTAGTG CAGGCATCGG CGGGGCCGGC 300

GGCCGCGGCG GTGACGGCGG CGATGGGGCC AGCGGTCTCG GCCTGGGCCT CTCCGGCTTT 360

GACGGCGGCC AAGGCGGCCA AGGCGGGGCC GGCGGCAGCG CCGGCGCCGG CGGCATCAAC 420

GGGGCCGGCG GGGCCGGCGG CAACGGCGGC GACGGCGGGG ACGGCGCAAC CGGTGCCGCA 480

GGTCTCGGCG ACAACGGCGG GGTCGGCGGT GACGGTGGGG CCGGTGGCGC CGCCGGCAAC 540

GGCGGCAACG CGGGCGTCGG CCTGACAGCC AAGGCCGGCG ACGGCGGCGC CGCGGGCAAT 600

GGCGGCAACG GGGGCGCCGG CGGTGCTGGC GGGGCCGGCG ACAACAATTT CAACGGCGGC 660

CAGGGTGGTG CCGGCGGCCA AGGCGGCCAA GGCGGCTTGG GCGGGGCAAG CACCACCTGA 720

TCGGCCTAGC CGCACCCGGG AAAGCCGATC CAACAGGCGA CGATGCCGCC TTCCTTGCCG 780

CGTTGGACCA GGCCGGCATC ACCTACGCTG ACCCAGGCCA CGCCATAACG GCCGCCAAGG 840

CGATGTGTGG GCTGTGTGCT AACGGCGTAA CAGGTCTACA GCTGGTCGCG GACCTGCGGG 900

AATACAATCC CGGGCTGACC ATGGACAGCG CGGCCAAGTT CGCTGCCATC GCATCAGGCG 960

CGTACTGCCC CGAACACCTG GAACA

(2) INFORMATION FOR SEQ ID NO: 178:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2138 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 178: 

CGGCACGAGG ATCGGTACCC CGCGGCATCG GCAGCTGCCG ATTCGCCGGG TTTCCCCACC 60 

CGAGGAAAGC CGCTACCAGA TGGCGCTGCC GAAGTAGGGC GATCCGTTCG CGATGCCGGC 120 

ATGAACGGGC GGCATCAAAT TAGTGCAGGA ACCTTTCAGT TTAGCGACGA TAATGGCTAT 180 

AGCACTAAGG AGGATGATCC GATATGACGC AGTCGCAGAC CGTGACGGTG GATCAGCAAG 240

AGATTTTGAA CAGGGCCAAC GAGGTGGAGG CCCCGATGGC GGACCCACCG ACTGATGTCC 300

CCATCACACC GTGCGAACTC ACGGCGGCTA AAAACGCCGC CCAACAGCTG GTATTGTCCG 360

CCGACAACAT GCGGGAATAC CTGGCGGCCG GTGCCAAAGA GCGGCAGCGT CTGGCGACCT 420

CGCTGCGCAA CGCGGCCAAG GCGTATGGCG AGGTTGATGA GGAGGCTGCG ACCGCGCTGG 480

ACAACGACGG CGAAGGAACT GTGCAGGCAG AATCGGCCGG GGCCGTCGGA GGGGACAGTT 540

CGGCCGAACT AACCGATACG CCGAGGGTGG CCACGGCCGG TGAACCCAAC TTCATGGATC 600

TCAAAGAAGC GGCAAGGAAG CTCGAAACGG GCGACCAAGG CGCATCGCTC GCGCACTTTG 660

CGGATGGGTG GAACACTTTC AACCTGACGC TGCAAGGCGA CGTCAAGCGG TTCCGGGGGT 720

TTGACAACTG GGAAGGCGAT GCGGCTACCG CTTGCGAGGC TTCGCTCGAT CAACAACGGC 780

AATGGATACT CCACATGGCC AAATTGAGCG CTGCGATGGC CAAGCAGGCT CAATATGTCG 840

CGCAGCTGCA CGTGTGGGCT AGGCGGGAAC ATCCGACTTA TGAAGACATA GTCGGGCTCG 900

AACGGCTTTA CGCGGAAAAC CCTTCGGCCC GCGACCAAAT TCTCCCGGTG TACGCGGAGT 960

ATCAGCAGAG GTCGGAGAAG GTGCTGACCG AATACAACAA CAAGGCAGCC CTGGAACCGG 1020

TAAACCCGCC GAAGCCTCCC CCCGCCATCA AGATCGACCC GCCCCCGCCT CCGCAAGAGC 1080 

AGGGATTGAT CCCTGGCTTC CTGATGCCGC CGTCTGACGG CTCCGGTGTG ACTCCCGGTA 1140

CCGGGATGCC AGCCGCACCG ATGGTTCCGC CTACCGGATC GCCGGGTGGT GGCCTCCCGG 1200

CTGACACGGC GGCGCAGCTG ACGTCGGCTG GGCGGGAAGC CGCAGCGCTG TCGGGCGACG 1260

TGGCGGTCAA AGCGGCATCG CTCGGTGGCG GTGGAGGCGG CGGGGTGCCG TCGGCGCCGT 1320

TGGGATCCGC GATCGGGGGC GCCGAATCGG TGCGGCCCGC TGGCGCTGGT GACATTGCCG 1380

GCTTAGGCCA GGGAAGGGCC GGCGGCGGCG CCGCGCTGGG CGGCGGTGGC ATGGGAATGC 1440

CGATGGGTGC CGCGCATCAG GGACAAGGGG GCGCCAAGTC CAAGGGTTCT CAGCAGGAAG 1500

ACGAGGCGCT CTACACCGAG GATCGGGCAT GGACCGAGGC CGTCATTGGT AACCGTCGGC 1560

GCCAGGACAG TAAGGAGTCG AAGTGAGCAT GGACGAATTG GACCCGCATG TCGCCCGGGC 1620

GTTGACGCTG GCGGCGCGGT TTCAGTCGGC CCTAGACGGG ACGCTCAATC AGATGAACAA 1680

CGGATCCTTC CGCGCCACCG ACGAAGCCGA GACCGTCGAA GTGACGATCA ATGGGCACCA 1740

GTGGCTCACC GGCCTGCGCA TCGAAGATGG TTTGCTGAAG AAGCTGGGTG CCGAGGCGGT 1800

GGCTCAGCGG GTCAACGAGG CGCTGCACAA TGCGCAGGCC GCGGCGTCCG CGTATAACGA 1860

CGCGGCGGGC GAGCAGCTGA CCGCTGCGTT ATCGGCCATG TCCCGCGCGA TGAACGAAGG 1920

AATGGCCTAA GCCCATTGTT GCGGTGGTAG CGACTACGCA CCGAATGAGC GCCGCAATGC 1980

GGTCATTCAG CGCGCCCGAC ACGGCGTGAG TACGCATTGT CAATGTTTTG ACATGGATCG 2040

GCCGGGTTCG GAGGGCGCCA TAGTCCTGGT CGCCAATATT GCCGCAGCTA GCTGGTCTTA 2100

GGTTCGGTTA CGCTGGTTAA TTATGACGTC CGTTACCA 2138
(2) INFORMATION FOR SEQ ID NO: 179:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 460 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 179: 

Met Thr Gln Ser Gln Thr Val Thr Val Asp Gln Gln Glu Ile Leu Asn
1 5 10 15

Arg Ala Asn Glu Val Glu Ala Pro Met Ala Asp Pro Pro Thr Asp Val
20 25 30

Pro Ile Thr Pro Cys Glu Leu Thr Ala Ala Lys Asn Ala Ala Gln Gln
35 40 45

Leu Val Leu Ser Ala Asp Asn Met Arg Glu Tyr Leu Ala Ala Gly Ala
50 55 60

Lys Glu Arg Gln Arg Leu Ala Thr Ser Leu Arg Asn Ala Ala Lys Ala
65 70 75 80

Tyr Gly Glu Val Asp Glu Glu Ala Ala Thr Ala Leu Asp Asn Asp Gly
85 90 95

Glu Gly Thr Val Gln Ala Glu Ser Ala Gly Ala Val Gly Gly Asp Ser
100 105 110

Ser Ala Glu Leu Thr Asp Thr Pro Arg Val Ala Thr Ala Gly Glu Pro
115 120 125

Asn Phe Met Asp Leu Lys Glu Ala Ala Arg Lys Leu Glu Thr Gly Asp
130 135 140

Gln Gly Ala Ser Leu Ala His Phe Ala Asp Gly Trp Asn Thr Phe Asn
145 150 155 160

Leu Thr Leu Gln Gly Asp Val Lys Arg Phe Arg Gly Phe Asp Asn Trp
165 170 175

Glu Gly Asp Ala Ala Thr Ala Cys Glu Ala Ser Leu Asp Gln Gln Arg
180 185 190

Gln Trp Ile Leu His Met Ala Lys Leu Ser Ala Ala Met Ala Lys Gln
195 200 205

Ala Gln Tyr Val Ala Gln Leu His Val Trp Ala Arg Arg Glu His Pro
210 215 220

Thr Tyr Glu Asp Ile Val Gly Leu Glu Arg Leu Tyr Ala Glu Asn Pro
225 230 235 240

Ser Ala Arg Asp Gln Ile Leu Pro Val Tyr Ala Glu Tyr Gln Gln Arg
245 250 255

Ser Glu Lys Val Leu Thr Glu Tyr Asn Asn Lys Ala Ala Leu Glu Pro
260 265 270

Val Asn Pro Pro Lys Pro Pro Pro Ala Ile Lys Ile Asp Pro Pro Pro
275 280 285

Pro Pro Gln Glu Gln Gly Leu Ile Pro Gly Phe Leu Met Pro Pro Ser
290 295 300

Asp Gly Ser Gly Val Thr Pro Gly Thr Gly Met Pro Ala Ala Pro Met
305 310 315 320

Val Pro Pro Thr Gly Ser Pro Gly Gly Gly Leu Pro Ala Asp Thr Ala
325 330 335

Ala Gln Leu Thr Ser Ala Gly Arg Glu Ala Ala Ala Leu Ser Gly Asp
340 345 350

Val Ala Val Lys Ala Ala Ser Leu Gly Gly Gly Gly Gly Gly Gly Val
355 360 365

Pro Ser Ala Pro Leu Gly Ser Ala Ile Gly Gly Ala Glu Ser Val Arg
370 375 380

Pro Ala Gly Ala Gly Asp Ile Ala Gly Leu Gly Gln Gly Arg Ala Gly
385 390 395 400

Gly Gly Ala Ala Leu Gly Gly Gly Gly Met Gly Met Pro Met Gly Ala
405 410 415

Ala His Gln Gly Gln Gly Gly Ala Lys Ser Lys Gly Ser Gln Gln Glu
420 425 430

Asp Glu Ala Leu Tyr Thr Glu Asp Arg Ala Trp Thr Glu Ala Val Ile
435 440 445

Gly Asn Arg Arg Arg Gln Asp Ser Lys Glu Ser Lys
450 455 460

(2) INFORMATION FOR SEQ ID NO: 180: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 277 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 180:

Ala Gly Asn Val Thr Ser Ala Ser Gly Pro His Arg Phe Gly Ala Pro
1 5 10 15

Asp Arg Gly Ser Gln Arg Arg Arg Arg His Pro Ala Ala Ser Thr Ala
20 25 30

Thr Glu Arg Cys Arg Phe Asp Arg His Val Ala Arg Gln Arg Cys Gly
35 40 45

Phe Pro Pro Ser Arg Arg Gln Leu Arg Arg Arg Val Ser Arg Glu Ala
50 55 60

Thr Thr Arg Arg Ser Gly Arg Arg Asn His Arg Cys Gly Trp His Pro
65 70 75 80

Gly Thr Gly Ser His Thr Gly Ala Val Arg Arg Arg His Gln Glu Ala
85 90 95

Arg Asp Gln Ser Leu Leu Leu Arg Arg Arg Gly Arg Val Asp Leu Asp
100 105 110

Gly Gly Gly Arg Leu Arg Arg Val Tyr Arg Phe Gln Gly Cys Leu Val
115 120 125

Val Val Phe Gly Gln His Leu Leu Arg Pro Leu Leu Ile Leu Arg Val
130 135 140

His Arg Glu Asn Leu Val Ala Gly Arg Arg Val Phe Arg Val Lys Pro
145 150 155 160

Phe Glu Pro Asp Tyr Val Phe Ile Ser Arg Met Phe Pro Pro Ser Pro
165 170 175

His Val Gln Leu Arg Asp Ile Leu Ser Leu Leu Gly His Arg Ser Ala
180 185 190

Gln Phe Gly His Val Glu Tyr Pro Leu Pro Leu Leu Ile Glu Arg Ser
195 200 205

Leu Ala Ser Gly Ser Arg Ile Ala Phe Pro Val Val Lys Pro Pro Glu
210 215 220

Pro Leu Asp Val Ala Leu Gln Arg Gln Val Glu Ser Val Pro Pro Ile
225 230 235 240

Arg Lys Val Arg Glu Arg Cys Ala Leu Val Ala Arg Phe Glu Leu Pro
245 250 255

Cys Arg Phe Phe Glu Ile His Glu Val Gly Phe Thr Gly Arg Gly His
260 265 270

Pro Arg Arg Ile Gly
275

(2) INFORMATION FOR SEQ ID NO: 181: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 192 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 181: 

Arg Val Ala Ala Ser Phe Ile Asp Trp Leu Asp Ser Pro Asp Ser Pro
1 5 10 15

Leu Asp Pro Ser Leu Val Ser Ser Leu Leu Asn Ala Val Ser Cys Gly
20 25 30

Ala Glu Ser Ser Ala Ser Ser Ser Ala Arg Ser Gly Asn Gly Ser Arg
35 40 45

Trp Thr Ser Met Pro Ser Gly Thr Arg Pro Gly Pro Arg Arg Ala Thr
50 55 60

Ser Arg Asp Asp Arg Arg Ser Ala Thr Ser Val Ile Pro Ser Arg Arg
65 70 75 80

Ser Val Ala Pro Arg Ala Glu Phe Gly Thr Arg Leu Ala Ser His Arg
85 90 95

Ala Ser Pro Ser Asn Ala Cys Pro Val Arg Ile Val Thr Ser Ala Ser
100 105 110

Gly Arg Pro Ile Ser Ser Pro Pro Ile Val Arg Ser Arg Ser Cys Val
115 120 125

Asp Lys Asn Gly Arg Arg Cys Ala Ser Gly Tyr Arg Arg Leu Asn Arg
130 135 140

Ala Arg Ser Ser Ser Ile Ala Ala Arg Cys Arg Thr Ile Gly Thr Phe
145 150 155 160

Arg Arg Ser Arg Tyr Ser Ala Ser Met Arg Val Ser Thr Asn Ser Pro
165 170 175

His Val Thr His Gly Val Ala Pro Gly Val Thr Arg Arg Ile Gly Gly
180 185 190



(2) INFORMATION FOR SEQ ID NO: 182: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 196 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 182: 

Gln Glu Arg Pro Gln Met Cys Gln Arg Val Ser Glu Ile Glu Pro Arg
1 5 10 15

Thr Gln Phe Phe Asn Arg Cys Ala Leu Pro His Tyr Trp His Phe Pro
20 25 30

Ala Val Ala Val Phe Ser Lys His Ala Ser Leu Asp Glu Leu Ala Pro
35 40 45

Arg Asn Pro Arg Arg Ser Ser Arg Arg Asp Ala Glu Asp Arg Arg Val
50 55 60

Ile Phe Ala Ala Thr Leu Val Ala Val Asp Pro Pro Leu Arg Gly Ala
65 70 75 80

Gly Gly Glu Ala Asp Gln Leu Ile Asp Leu Gly Val Cys Arg Arg Gln
85 90 95

Ala Gly Arg Val Arg Arg Gly Gln Glu Leu His His Arg His Arg His
100 105 110

Gln Gly Ala Ala Pro Asp Leu Arg Arg Arg Arg Arg His Arg Arg Val
115 120 125

Gln Gln His Arg Arg Leu Gln Arg Val Arg Gln Leu Arg Arg Tyr Val
130 135 140

Gln Thr Ala His His Arg Arg Phe Ala Arg Thr Asp Arg Val Arg His
145 150 155 160

His Val Arg Gly Pro Ser Asn His Arg Arg Arg Arg Val Tyr Arg Gly
165 170 175

Arg His Ser Gly Ala Gly Gly Cys Pro Ala Gly Gly Ala Gly Ser Val
180 185 190

Gly Gly Ser Ala
195

(2) INFORMATION FOR SEQ ID NO: 183: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 311 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 183: 

Val Arg Cys Gly Thr Leu Val Pro Val Pro Met Val Glu Phe Leu Thr 
1 5 10 15 

Ser Thr Asn Ala Pro Ser Leu Pro Ser Ala Tyr Ala Glu Val Asp Lys 
20 25 30 

Leu Ile Gly Leu Pro Ala Gly Thr Ala Lys Arg Trp Ile Asn Gly Tyr
35 40 45

Glu Arg Gly Gly Lys Asp His Pro Pro Ile Leu Arg Val Thr Pro Gly
50 55 60

Ala Thr Pro Trp Val Thr Trp Gly Glu Phe Val Glu Thr Arg Met Leu
65 70 75 80

Ala Glu Tyr Arg Asp Arg Arg Lys Val Pro Ile Val Arg Gln Arg Ala
85 90 95

Ala Ile Glu Glu Leu Arg Ala Arg Phe Asn Leu Arg Tyr Pro Leu Ala
100 105 110

His Leu Arg Pro Phe Leu Ser Thr His Glu Arg Asp Leu Thr Met Gly
115 120 125

Gly Glu Glu Ile Gly Leu Pro Asp Ala Glu Val Thr Ile Arg Thr Gly
130 135 140

Gln Ala Leu Leu Gly Asp Ala Arg Trp Leu Ala Ser Leu Val Pro Asn
145 150 155 160

Ser Ala Arg Gly Ala Thr Leu Arg Arg Leu Gly Ile Thr Asp Val Ala
165 170 175

Asp Leu Arg Ser Ser Arg Glu Val Ala Arg Arg Gly Pro Gly Arg Val
180 185 190

Pro Asp Gly Ile Asp Val His Leu Leu Pro Phe Pro Asp Leu Ala Asp
195 200 205

Asp Asp Ala Asp Asp Ser Ala Pro His Glu Thr Ala Phe Lys Arg Leu
210 215 220

Leu Thr Asn Asp Gly Ser Asn Gly Glu Ser Gly Glu Ser Ser Gln Ser
225 230 235 240

Ile Asn Asp Ala Ala Thr Arg Tyr Met Thr Asp Glu Tyr Arg Gln Phe
245 250 255

Pro Thr Arg Asn Gly Ala Gln Arg Ala Leu His Arg Val Val Thr Leu
260 265 270

Leu Ala Ala Gly Arg Pro Val Leu Thr His Cys Phe Ala Gly Lys Asp
275 280 285

Arg Thr Gly Phe Val Val Ala Leu Val Leu Glu Ala Val Gly Leu Asp
290 295 300

Arg Asp Val Ile Val Ala Asp
305 310

(2) INFORMATION FOR SEQ ID NO: 184: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2072 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 184: 

CTCGTGCCGA TTCGGCACGA GCTGAGCAGC CCAAGGGGCC GTTCGGCGAA GTCATCGAGG 60 

CATTCGCCGA CGGGCTGGCC GGCAAGGGTA AGCAAATCAA CACCACGCTG AACAGCCTGT 120 

CGCAGGCGTT GAACGCCTTG AATGAGGGCC GCGGCGACTT CTTCGCGGTG GTACGCAGCC 180 

TGGCGCTATT CGTCAACGCG CTACATCAGG ACGACCAACA GTTCGTCGCG TTGAACAAGA 240

ACCTTGCGGA GTTCACCGAC AGGTTGACCC ACTCCGATGC GGACCTGTCG AACGCCATCC 300 

AGCAATTCGA CAGCTTGCTC GCCGTCGCGC GCCCGTTCTT CGCCAAGAAC CGCGAGGTGC 360 

TGACGCATGA CGTCAATAAT CTCGCGACCG TGACCACCAC GTTGCTGCAG CCCGATCCGT 420

TGGATGGGTT GGAGACCGTC CTGCACATCT TCCCGACGCT GGCGGCGAAC ATTAACCAGC 480

TTTACCATCC GACACACGGT GGCGTGGTGT CGCTTTCCGC GTTCACGAAT TTCGCCAACC 540

CGATGGAGTT CATCTGCAGC TCGATTCAGG CGGGTAGCCG GCTCGGTTAT CAAGAGTCGG 600 

CCGAACTCTG TGCGCAGTAT CTGGCGCCAG TCCTCGATGC GATCAAGTTC AACTACTTTC 660 

CGTTCGGCCT GAACGTGGCC AGCACCGCCT CGACACTGCC TAAAGAGATC GCGTACTCCG 720 

AGCCCCGCTT GCAGCCGCCC AACGGGTACA AGGACACCAC GGTGCCCGGC ATCTGGGTGC 780

CGGATACGCC GTTGTCACAC CGCAACACGC AGCCCGGTTG GGTGGTGGCA CCCGGGATGC 840

AAGGGGTTCA GGTGGGACCG ATCACGCAGG GTTTGCTGAC GCCGGAGTCC CTGGCCGAAC 900 

TCATGGGTGG TCCCGATATC GCCCCTCCGT CGTCAGGGCT GCAAACCCCG CCCGGACCCC 960 

CGAATGCGTA CGACGAGTAC CCCGTGCTGC CGCCGATCGG TTTACAGGCC CCACAGGTGC 1020 

CGATACCACC GCCGCCTCCT GGGCCCGACG TAATCCCGGG TCCGGTGCCA CCGGTCTTGG 1080

CGGCGATCGT GTTCCCAAGA GATCGCCCGG CAGCGTCGGA AAACTTCGAC TACATGGGCC 1140

TCTTGTTGCT GTCGCCGGGC CTGGCGACCT TCCTGTTCGG GGTGTCATCT AGCCCCGCCC 1200

GTGGAACGAT GGCCGATCGG CACGTGTTGA TACCGGCGAT CACCGGCCTG GCGTTGATCG 1260

CGGCATTCGT CGCACATTCG TGGTACCGCA CAGAACATCC GCTCATAGAC ATGCGCTTGT 1320

TCCAGAACCG AGCGGTCGCG CAGGCCAACA TGACGATGAC GGTGCTCTCC CTCGGGCTGT 1380

TTGGCTCCTT CTTGCTGGTC CCGAGCTACC TCCAGCAAGT GTTGCACCAA TCACCGATGC 1440

AATCGGGGGT GCATATCATC CCACAGGGCC TCGGTGCCAT GCTGGCGATG CCGATCGCCG 1500

GAGCGATGAT GGACCGACGG GGACCGGCCA AGATCGTGCT GGTTGGGATC ATGCTGATCG 1560

CTGCGGGGTT GGGCACCTTC GCCTTTGGTG TCGCGCGGCA AGCGGACTAC TTACCCATTC 1620

TGCCGACCGG GCTGGCAATC ATGGGCATGG GCATGGGCTG CTCCATGATG CCACTGTCCG 1680

GGGCGGCAGT GCAGACCCTG GCCCCACATC AGATCGCTCG CGGTTCGACG CTGATCAGCG 1740

TCAACCAGCA GGTGGGCGGT TCGATAGGGA CCGCACTGAT GTCGGTGCTG CTCACCTACC 1800

AGTTCAATCA CAGCGAAATC ATCGCTACTG CAAAGAAAGT CGCACTGACC CCAGAGAGTG 1860

GCGCCGGGCG GGGGGCGGCG GTTGACCCTT CCTCGCTACC GCGCCAAACC AACTTCGCGG 1920

CCCAACTGCT GCATGACCTT TCGCACGCCT ACGCGGTGGT ATTCGTGATA GCGACCGCGC 1980

TAGTGGTCTC GACGCTGATC CCCGCGGCAT TCCTGCCGAA ACAGCAGGCT AGTCATCGAA 2040

GAGCACCGTT GCTATCCGCA TGACGTCTGC TT 2072



(2) INFORMATION FOR SEQ ID NO: 185:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1923 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 185: 



TCACCCCGGA GAAGTCGTTC GTCGACGACC TGGACATCGA CTCGCTGTCG ATGGTCGAGA 60

TCGCCGTGCA GACCGAGGAC AAGTACGGCG TCAAGATCCC CGACGAGGAC CTCGCCGGTC 120

TGCGTACCGT CGGTGACGTT GTCGCCTACA TCCAGAAGCT CGAGGAAGAA AACCCGGAGG 180

CGGCTCAGGC GTTGCGCGCG AAGATTGAGT CGGAGAACCC CGATGCGGCA CGAGCAGATC 240

GGTGCGTTTC ACCCACATCG CAAGCTCGAG ACGCCCGTCG TCCTCTTGCA CGCTCAGCCA 300

GGTTGGCGTG TCGCCGCCTT CCAGCAAGTG TTCCCACCAC ACGAAGGGAC CCTCGCGAAA 360

GGTGACTGAT CCGCGGACCA CATAGTCGAT GCCACCGTGG CTGACAATTG CGCCGGGTCC 420

GAGTTGGCGG GGGCCGAATT GCGGCATTGC GTCGAAGGCC AGCGGATCCC GGCGCCCGCC 480

CGGCGTGGCT GGTGTTTTGG GCCGCCGGAT GGCCACGACG AGAACGACGA TGGCGGCGAT 540

GAACAGCGCC ACGGCAATCA CGACCAGCAG ATTTCCCACG CATACCCTCT CGTACCGCTG 600 

CGCCGCGGTT GGTCGATCGG TCGCATATCG ATGGCGCCGT TTAACGTAAC AGCTTTCGCG 660 

GGACCGGGGG TCACAACGGG CGAGTTGTCC GGCCGGGAAC CCGGCAGGTC TCGGCCGCGG 720 

TCACCCCAGC TCACTGGTGC ACCATCCGGG TGTCGGTGAG CGTGCAACTC AAACACACTC 780

AACGGCAACG GTTTCTCAGG TCACCAGCTC AACCTCGACC CGCAATCGCT CGTACGTTTC 840

GACCGCGCGC AGGTCGCGAG TCAGCAGCTT TGCGCCGGCA GCTTTCGCCG TGAAGCCGAC 900 

CAGGGCATCG TAGGTTGCGC CACCGGTGAC ATCGTGCTCG GCGAGGTGGT CGGTCAAGCC 960 

GCGATATGAG CAGGCATCCA GTGCCAGGTA GTTGCTGGAG GTGATGTCCG CCAAGTAGGC 1020 

GTGGACGGCA ACAGGGGCAA TACGATGCGG CGGTGGTAGC CGGGTCAAGA CCGAATAGGT 1080 

TTCCACAGCC GCGTGCGCGA TCAGATGGAC GCCACGGTTG AGCGCGCGCA CGGCGGCCTC 1140

GTGCCCTTCG TGCCAGGTCG CGAATCCGGC AACCAGCACG CTGGTGTCTG GTGCGATCAC 1200 

CGCCGTGTGC GATCGAGCGT TTCCCGAACG ATTTCGTCGG TCAACGGGGG CAGGGGACGT 1260

TCTGGCCGTG CGACGAGAAC CGAGCCTTCC CGAACGAGTT CGACACCGGT CGGGGCCGGC 1320

TCAATCTCGA TGCGCCCATC GCGCTCGGTG ATCTCCACCT GGTCGTTCCC GCGCAAGCCA 1380

AGGCGCTCGC GAATCCGCTT GGGAATCACC AGACGTCCTG CGACATCGAT GGTTGTTCGC 1440

ATGGTAGGAA ATTTACCATC GCACGTTCCA TAGGCGTGTC CTGCGCGGGA TGTCGGGACG 1500 

ATCCGCTAGC GTATCGAACG ATTGTTTCGG AAATGGCTGA GGGAGCGTGC GGTGCGGGTG 1560 

ATGGGTGTCG ATCCCGGGTT GACCCGATGC GGGCTGTCGC TCATCGAGAG TGGGCGTGGT 1620 

CGGCAGCTCA CCGCGCTGGA TGTCGACGTG GTGCGCACAC CGTCGGATGC GGCCTTGGCG 1680

CAGCGCCTGT TGGCCATCAG CGATGCCGTC GAGCACTGGC TGGACACCCA TCATCCGGAG 1740

GTGGTGGCTA TCGAACGGGT GTTCTCTCAG CTCAACGTGA CCACGGTGAT GGGCACCGCG 1800 

CAGGCCGGCG GCGTGATCGC CCTGGCGGCG GCCAAACGTG GTGTCGACGT GCATTTCCAT 1860

ACCCCCAGCG AGGTCAAGGC GGCGGTCACT GGCAACGGTT CCGCAGACAA GGCTCAGGTC 1920 

ACC 1923 
(2) INFORMATION FOR SEQ ID NO: 186: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1055 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 186: 

CTGGCGTGCC AGTGTCACCG GCGATATGAC GTCGGCATTC AATTTCGCGG CCCCGCCGGA 60 

CCCGTCGCCA CCCAATCTGG ACCACCCGGT CCGTCAATTG CCGAAGGTCG CCAAGTGCGT 120 

GCCCAATGTG GTGCTGGGTT TCTTGAACGA AGGCCTGCCG TATCGGGTGC CCTACCCCCA 180 

AACAACGCCA GTCCAGGAAT CCGGTCCCGC GCGGCCGATT CCCAGCGGCA TCTGCTAGCC 240 

GGGGATGGTT CAGACGTAAC GGTTGGCTAG GTCGAAACCC GCGCCAGGGC CGCTGGACGG 300 

GCTCATGGCA GCGAAATTAG AAAACCCGGG ATATTGTCCG CGGATTGTCA TACGATGCTG 360

AGTGCTTGGT GGTTCGTGTT TAGCCATTGA GTGTGGATGT GTTGAGACCC TGGCCTGGAA 420

GGGGACAACG TGCTTTTGCC TCTTGGTCCG CCTTTGCCGC CCGACGCGGT GGTGGCGAAA 480 

CGGGCTGAGT CGGGAATGCT CGGCGGGTTG TCGGTTCCGC TCAGCTGGGG AGTGGCTGTG 540

CCACCCGATG ATTATGACCA CTGGGCGCCT GCGCCGGAGG ACGGCGCCGA TGTCGATGTC 600 

CAGGCGGCCG AAGGGGCGGA CGCAGAGGCC GCGGCCATGG ACGAGTGGGA TGAGTGGCAG 660 

GCGTGGAACG AGTGGGTGGC GGAGAACGCT GAACCCCGCT TTGAGGTGCC ACGGAGTAGC 720

AGCAGCGTGA TTCCGCATTC TCCGGCGGCC GGCTAGGAGA GGGGGCGCAG ACTGTCGTTA 780

TTTGACCAGT GATCGGCGGT CTCGGTGTTC CCGCGGCCGG CTATGACAAC AGTCAATGTG 840

CATGACAAGT TACAGGTATT AGGTCCAGGT TCAACAAGGA GACAGGCAAC ATGGCAACAC 900 

GTTTTATGAC GGATCCGCAC GCGATGCGGG ACATGGCGGG CCGTTTTGAG GTGCACGCCC 960 

AGACGGTGGA GGACGAGGCT CGCCGGATGT GGGCGTCCGC GCAAAACATC TCGGGNGCGG 1020 

GCTGGAGTGG CATGGCCGAG GCGACCTCGC TAGAC 1055 
(2) INFORMATION FOR SEQ ID NO: 187: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 359 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 187:

CCGCCTCGTT GTTGGCATAC TCCGCCGCGG CCGCCTCGAC CGCACTGGCC GTGGCGTGTG 60 

TCCGGGCTGA CCACCGGGAT CGCCGAACCA TCCGAGATCA CCTCGCAATG ATCCACCTCG 120 

CGCAGCTGGT CACCCAGCCA CCGGGCGGTG TGCGACAGCG CCTGCATCAC CTTGGTATAG 180

CCGTCGCGCC CCAGCCGCAG GAAGTTGTAG TACTGGCCCA CCACCTGGTT ACCGGGACGG 240

GAGAAGTTCA GGGTGAAGGT CGGCATGTCG CCGCCGAGGT AGTTGACCCG GAAAACCAGA 300

TCCTCCGGCA GGTGCTCGGG CCCGCGCCAC ACGACAAACC CGACGCCGGG ATAGGTCAG 359
(2) INFORMATION FOR SEQ ID NO: 188: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 350 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 188: 

AACGGGCCCG TGGGCACCGC TCCTCTAAGG GCTCTCGTTG GTCGCATGAA GTGCTGGAAG 60 

GATGCATCTT GGCAGATTCC CGCCAGAGCA AAACAGCCGC TAGTCCTAGT CCGAGTCGCC 120 

CGCAAAGTTC CTCGAATAAC TCCGTACCCG GAGCGCCAAA CCGGGTCTCC TTCGCTAAGC 180 

TGCGCGAACC ACTTGAGGTT CCGGGACTCC TTGACGTCCA GACCGATTCG TTCGAGTGGC 240

TGATCGGTTC GCCGCGCTGG CGCGAATCCG CCGCCGAGCG GGGTGATGTC AACCCAGTGG 300 

GTGGCCTGGA AGAGGTGCTC TACGAGCTGT CTCCGATCGA GGACTTCTCC 350 
(2) INFORMATION FOR SEQ ID NO: 189:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 679 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 189: 

Glu Gln Pro Lys Gly Pro Phe Gly Glu Val Ile Glu Ala Phe Ala Asp
1 5 10 15

Gly Leu Ala Gly Lys Gly Lys Gln Ile Asn Thr Thr Leu Asn Ser Leu
20 25 30

Ser Gln Ala Leu Asn Ala Leu Asn Glu Gly Arg Gly Asp Phe Phe Ala
35 40 45

Val Val Arg Ser Leu Ala Leu Phe Val Asn Ala Leu His Gln Asp Asp
50 55 60

Gln Gln Phe Val Ala Leu Asn Lys Asn Leu Ala Glu Phe Thr Asp Arg
65 70 75 80

Leu Thr His Ser Asp Ala Asp Leu Ser Asn Ala Ile Gln Gln Phe Asp
85 90 95

Ser Leu Leu Ala Val Ala Arg Pro Phe Phe Ala Lys Asn Arg Glu Val
100 105 110

Leu Thr His Asp Val Asn Asn Leu Ala Thr Val Thr Thr Thr Leu Leu
115 120 125

Gln Pro Asp Pro Leu Asp Gly Leu Glu Thr Val Leu His Ile Phe Pro
130 135 140

Thr Leu Ala Ala Asn Ile Asn Gln Leu Tyr His Pro Thr His Gly Gly
145 150 155 160

Val Val Ser Leu Ser Ala Phe Thr Asn Phe Ala Asn Pro Met Glu Phe
165 170 175

Ile Cys Ser Ser Ile Gln Ala Gly Ser Arg Leu Gly Tyr Gln Glu Ser
180 185 190

Ala Glu Leu Cys Ala Gln Tyr Leu Ala Pro Val Leu Asp Ala Ile Lys
195 200 205

Phe Asn Tyr Phe Pro Phe Gly Leu Asn Val Ala Ser Thr Ala Ser Thr
210 215 220

Leu Pro Lys Glu Ile Ala Tyr Ser Glu Pro Arg Leu Gln Pro Pro Asn
225 230 235 240

Gly Tyr Lys Asp Thr Thr Val Pro Gly Ile Trp Val Pro Asp Thr Pro
245 250 255

Leu Ser His Arg Asn Thr Gln Pro Gly Trp Val Val Ala Pro Gly Met
260 265 270

Gln Gly Val Gln Val Gly Pro Ile Thr Gln Gly Leu Leu Thr Pro Glu
275 280 285

Ser Leu Ala Glu Leu Met Gly Gly Pro Asp Ile Ala Pro Pro Ser Ser
290 295 300

Gly Leu Gln Thr Pro Pro Gly Pro Pro Asn Ala Tyr Asp Glu Tyr Pro
305 310 315 320

Val Leu Pro Pro Ile Gly Leu Gln Ala Pro Gln Val Pro Ile Pro Pro
325 330 335

Pro Pro Pro Gly Pro Asp Val Ile Pro Gly Pro Val Pro Pro Val Leu
340 345 350



Ala Ala Ile Val Phe Pro Arg Asp Arg Pro Ala Ala Ser Glu Asn Phe
355 360 365

Asp Tyr Met Gly Leu Leu Leu Leu Ser Pro Gly Leu Ala Thr Phe Leu
370 375 380

Phe Gly Val Ser Ser Ser Pro Ala Arg Gly Thr Met Ala Asp Arg His
385 390 395 400

Val Leu Ile Pro Ala Ile Thr Gly Leu Ala Leu Ile Ala Ala Phe Val
405 410 415

Ala His Ser Trp Tyr Arg Thr Glu His Pro Leu Ile Asp Met Arg Leu
420 425 430

Phe Gln Asn Arg Ala Val Ala Gln Ala Asn Met Thr Met Thr Val Leu
435 440 445

Ser Leu Gly Leu Phe Gly Ser Phe Leu Leu Leu Pro Ser Tyr Leu Gln
450 455 460

Gln Val Leu His Gln Ser Pro Met Gln Ser Gly Val His Ile Ile Pro
465 470 475 480

Gln Gly Leu Gly Ala Met Leu Ala Met Pro Ile Ala Gly Ala Met Met
485 490 495

Asp Arg Arg Gly Pro Ala Lys Ile Val Leu Val Gly Ile Met Leu Ile
500 505 510

Ala Ala Gly Leu Gly Thr Phe Ala Phe Gly Val Ala Arg Gln Ala Asp
515 520 525

Tyr Leu Pro Ile Leu Pro Thr Gly Leu Ala Ile Met Gly Met Gly Met
530 535 540

Gly Cys Ser Met Met Pro Leu Ser Gly Ala Ala Val Gln Thr Leu Ala
545 550 555 560

Pro His Gln Ile Ala Arg Gly Ser Thr Leu Ile Ser Val Asn Gln Gln
565 570 575



Val Gly Gly Ser Ile Gly Thr Ala Leu Met Ser Val Leu Leu Thr Tyr
580 585 590

Gln Phe Asn His Ser Glu Ile Ile Ala Thr Ala Lys Lys Val Ala Leu
595 600 605

Thr Pro Glu Ser Gly Ala Gly Arg Gly Ala Ala Val Asp Pro Ser Ser
610 615 620

Leu Pro Arg Gln Thr Asn Phe Ala Ala Gln Leu Leu His Asp Leu Ser
625 630 635 640

His Ala Tyr Ala Val Val Phe Val Ile Ala Thr Ala Leu Val Val Ser
645 650 655

Thr Leu Ile Pro Ala Ala Phe Leu Pro Lys Gln Gln Ala Ser His Arg
660 665 670

Arg Ala Pro Leu Leu Ser Ala
675

(2) INFORMATION FOR SEQ ID NO: 190:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 120 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 190:

Thr Pro Glu Lys Ser Phe Val Asp Asp Leu Asp Ile Asp Ser Leu Ser
1 5 10 15

Met Val Glu Ile Ala Val Gln Thr Glu Asp Lys Tyr Gly Val Lys Ile
20 25 30

Pro Asp Glu Asp Leu Ala Gly Leu Arg Thr Val Gly Asp Val Val Ala
35 40 45

Tyr Ile Gln Lys Leu Glu Glu Glu Asn Pro Glu Ala Ala Gln Ala Leu
50 55 60

Arg Ala Lys Ile Glu Ser Glu Asn Pro Asp Ala Ala Arg Ala Asp Arg
65 70 75 80

Cys Val Ser Pro Thr Ser Gln Ala Arg Asp Ala Arg Arg Pro Leu Ala
85 90 95

Arg Ser Ala Arg Leu Ala Cys Arg Arg Leu Pro Ala Ser Val Pro Thr
100 105 110

Thr Arg Arg Asp Pro Arg Glu Arg
115 120

(2) INFORMATION FOR SEQ ID NO: 191: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 89 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 191: 

Leu Ala Cys Gln Cys His Arg Arg Tyr Asp Val Gly Ile Gln Phe Arg
1 5 10 15

Gly Pro Ala Gly Pro Val Ala Thr Gln Ser Gly Pro Pro Gly Pro Ser
20 25 30

Ile Ala Glu Gly Arg Gln Val Arg Ala Gln Cys Gly Ala Gly Phe Leu
35 40 45

Glu Arg Arg Pro Ala Val Ser Gly Ala Leu Pro Pro Asn Asn Ala Ser
50 55 60

Pro Gly Ile Arg Ser Arg Ala Ala Asp Ser Gln Arg His Leu Leu Ala
65 70 75 80

Gly Asp Gly Ser Asp Val Thr Val Gly 
85 

(2) INFORMATION FOR SEQ ID NO: 192: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 119 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 192: 

Ala Ser Leu Leu Ala Tyr Ser Ala Ala Ala Ala Ser Thr Ala Leu Ala 
1 5 10 15

Val Ala Cys Val Arg Ala Asp His Arg Asp Arg Arg Thr Ile Arg Asp
20 25 30

His Leu Ala Met Ile His Leu Ala Gln Leu Val Thr Gln Pro Pro Gly
35 40 45

Gly Val Arg Gln Arg Leu His His Leu Gly Ile Ala Val Ala Pro Gln
50 55 60

Pro Gln Glu Val Val Val Leu Ala His His Leu Val Thr Gly Thr Gly
65 70 75 80

Glu Val Gln Gly Glu Gly Arg His Val Ala Ala Glu Val Val Asp Pro
85 90 95

Glu Asn Gln Ile Leu Arg Gln Val Leu Gly Pro Ala Pro His Asp Lys
100 105 110

Pro Asp Ala Gly Ile Gly Gln
115



(2) INFORMATION FOR SEQ ID NO: 193: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 116 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 193: 

Arg Ala Arg Gly His Arg Ser Ser Lys Gly Ser Arg Trp Ser His Glu
1 5 10 15

Val Leu Glu Gly Cys Ile Leu Ala Asp Ser Arg Gln Ser Lys Thr Ala
20 25 30

Ala Ser Pro Ser Pro Ser Arg Pro Gln Ser Ser Ser Asn Asn Ser Val
35 40 45

Pro Gly Ala Pro Asn Arg Val Ser Phe Ala Lys Leu Arg Glu Pro Leu
50 55 60

Glu Val Pro Gly Leu Leu Asp Val Gln Thr Asp Ser Phe Glu Trp Leu
65 70 75 80

Ile Gly Ser Pro Arg Trp Arg Glu Ser Ala Ala Glu Arg Gly Asp Val
85 90 95

Asn Pro Val Gly Gly Leu Glu Glu Val Leu Tyr Glu Leu Ser Pro Ile
100 105 110

Glu Asp Phe Ser
115

(2) INFORMATION FOR SEQ ID NO: 194:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 811 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 194: 





CAATCGCTTT GGTGACAGAT GTGGATGCCG GCGTCGCTGC TGGCGATGGC 60

GTGAAAGCCG CCGACGTGTT CGCCGCATTC GGGGAGAACA TCtaAAC 1 VjL, 1 120

GTGCGGGCCG CCATCGATCG GGTCGCCGAC GAGCGCACGT GCACGCACTG TCAACACCAC 180

GCCGGTGTTC CGTTGCCGTT CGAGCTGCCA TGAGGGTGCT GCTGACCGGC GCGGCCGGCT 240

TCATCGGGTC GCGCGTGGAT GCGGCGTTAC GGGCTGCGGG TCACGACGTG GTGGGCGTCG 300

ACGCGCTGCT GCCCGCCGCG CACGGGCCAA ACCCGGTGCT GCCACCGGGC TGCCAGCGGG 360

TCGACGTGCG CGACGCCAGC GCGCTGGCCC CGTTGTTGGC CGGTGTCGAT CTGGTGTGTC 420

ACCAGGCCGC CATGGTGGGT GCCGGCGTCA ACGCCGCCGA CGCACCCGCC TATGGCGGCC 480

ACAACGATTT CGCCACCACG GTGCTGCTGG CGCAGATGTT CGCCGCCGGG GTCCGCCGTT 540

TGGTGCTGGC GTCGTCGATG GTGGTTTACG GGCAGGGGCG CTATGACTGT CCCCAGCATG 600

GACCGGTCGA CCCGCTGCCG CGGCGGCGAG CCGACCTGGA CAATGGGGTC TTCGAGCACC 660

GTTGCCCGGG GTGCGGCGAG CCAGTCATCT GGCAATTGGT CGACGAAGAT GCCCCGTTGC 720

GCCCGCGCAG CCTGTACGCG GCAGCAAGAC CGCGCAGGAG CACTACGCGC TGGCGTGGTC 780

GGAAACGAAT GGCGGTTCCG TGGTGGCGTT G 811



(2) INFORMATION FOR SEQ ID NO: 195: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 966 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 195:



GTCCCGCGAT GTGGCCGAGC ATGACTTTCG GCAACACCGG CGTAGTAGTC GAAGATATCG 60

GACTTTGTGG TCCCGGTGGC GGGATAGAGC ACCTGTCGGC GTTGGTCAGC GTCACCCGTT 120

GCTCGGACGC CGAACCCATG CTTTCAACGT AGCCTGTCGG TCACACAAGT CGCGAGCGTA 180

ACGTCACGGT CAAATATCGC GTGGAATTTC GCCGTGACGT TCCGCTCGCG GACAATCAAG 240

GCATACTCAC TTACATGCGA GCCATTTGGA CGGGTTCGAT CGCCTTCGGG CTGGTGAACG 300

TGCCGGTCAA GGTGTACAGC GCTACCGCAG ACCACGACAT CAGGTTCCAC CAGGTGCACG 360

CCAAGGACAA CGGACGCATC CGGTACAAGC GCGTCTGCGA GGCGTGTGGC GAGGTGGTCG 420

ACTACCGCGA TCTTGCCCGG GCCTACGAGT CCGGCGACGG CCAAATGGTG GCGATCACCG 480

ACGACGACAT CGCCAGCTTG CCTGAAGAAC GCAGCCGGGA GATCGAGGTG TTGGAGTTCG 540

TCCCCGCCGC CGACGTGGAC CCGATGATGT TCGACCGCAG CTACTTTTTG GAGCCTGATT 600

GTGCTGCTGG CTAAGACACT CGCCGAGACC GACCGGATGG 660

CGATCGTGGA TCGCCCCACC GGCCGTGAAT GCAGGAAAAA TAAGAGCCGC TATCCACAAT 720

TCGGCGTCGA GCTCGGCTAC CACAAACGGT AGAACGATCG AGACATTCCC GAGCTGAAGT 780

GCGGCGCTAT AGAAGCCGCT CTGCGCGATT ATCAAACGCA AAATACGCTT ACTCATGCCA 840

TCGGCGCTGC TCACCCGATG CGACGTTTTT GCCACGCTCC ACCGCCTGCC GCGCGACCTC 900

AAGTGGGCAT GCATCCCACC CGTTCCCGGA AACCGGTTCC GGCGGGTCGG CTCATCGCTT 960

966



(2) INFORMATION FOR SEQ ID NO: 196: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2367 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 196:

CCGCACCGCC GGCAATACCG CCAGCGCCAC CGTTACCGCC GTTTGCGCCG TTGCCCCCGT 60

TGCCGCCCGT CCCGCCGGCC CCGCCGATGG AGTTCTCATC GCCAAAAGTA CTGGCGTTGC 120

CACCGGAGCC GCCGTTGCCG CCGTCACCGC CAGCCCCGCC GACTCCACCG GCCCCACCGA 180






CTCCGCCGCT GCCACCGTTG CCGCCGTTGC CGATCAACAT GCCGCTGGCG CCACCCTTGC 240

CACCCACGCC ACCGGCTCCG CCCACCCCGC CGACACCAAG CGAGCTGCCG CCGGAGCCAC 300 

CATCACCACC TACGCCACCG ACCGCCCAGA CACCAGCGAC CGGGTCTTCG TGAAACGTCG 360 

CGGTGCCACC ACCGCCGCCG TTACCGCCAA CCCCACCGGC AACGCCGGCG CCGCCATCCC 420

CGCCGGCCCC GGCGTTGCCG CCGTTGCCGC CGTTGCCGAA CAACAACCCG CCGGCGCCGC 480 

CGTTGCCGCC CGCGCCGCCG GTCCCGCCGG CGCCGCCGAC GCCAAGGCCG CTGCCGCCCT 540

TGCCGCCATC ACCACCCTTG CCGCCGACCA CATCGGGTTC TGCCTCGGGG TCTGGGCTGT 600 

CAAACCTCGC GATGCCAGCG TTGCCGCCGC TTCCCCCGGG CCCCCCCGTG GCGCCGTCAC 660 

CACCGATACC ACCCGCGCCA CCGGCGCCAC CGTTGCCGCC ATCACCGAAT AGCAACCCGC 720

CGGCGCCACC ATTGCCGCCA GCTCCCCCTG CGCCACCGTC GGCGCCGGAG GCGGCACTGG 780 

CAGCCCCGTT ACCACCGAAA CCGCCGCTAC CACCGGTAGA GGTGGCAGTG GCGATGTGTA 840

CGAAAGCGCC GCCTCCGGCG CCGCCGCTAC CACCCCCACT GCCGGCGGCT ACACCGTCGG 900 

ACCCGTTGCC ACCATCACCG CCAAAGGCGC TCGCAATGTC GCCCTGCGCG ACTCCGCCGT 960 

CGCCGCCGTT GCCGCCGCCG CCACCGGCAG CGGCGGTACC GCCGTCACCA CCGGCACCGC 1020 

CGGTGGCCTT GCCCGAGCCT GCCGTCGCGG TGGCACCGTC GCCGCCGGTG CCACCGGTCG 1080

GCGTGCCGGC AGTGCCATGG CCGCCCGTGC CGCCGTCGCC GCCGGTTTGA TCACCGATGC 1140

CGGACACATC TGCCGGGCTG TCCCCGGTGC TGGCCGCGGG GCCGGGCGTG GGATTGACCC 1200 

CGTTTGCCCC GGCGAGGCCG GCGCCGCCGG TACCACCGGC GCCGCCATGG CCGAACAGCC 1260

CGGCGTTGCC GCCGTTACCG CCCGCACCCC CGATGCCTGC GGCCACGCTG GTGCCGCCGA 1320 

CACCGCCGTT GCCGCCGTTG CCCCACAACC ACCCCCCGTT CCCACCGGCA CCGCCGGCCG 1380 

CGCCGGTACC ACCGGCCCCG CCGTTGCCGC CGTTGCCGAT CAACCCGGCC GCGCCTCCGC 1440

TGCCGCCGGT TTGACCGAAC CCGCCAGCCG CGCCGTTGCC ACCGTTGCCA AACAGCAACC 1500 

CGCCGGCCGC GCCAGGCTGC CCGGGTGCCG TCCCGTCGGC GCCGTTTCCG ATCAACGGGC 1560 

GCCCCAAAAG CGCCTCGGTG GGCGCATTCA CCGCACCCAG CAGACTCCGC TCAACAGCGG 1620 

CTTCAGTGCT GGCATACCGA CCCGCGGCCG CAGTCAACGC CTGCACAAAC TGCTCGTGAA 1680 

ACGCTGCCAC CTGTACGCTG AGCGCCTGAT ACTGCCGAGC ATGGGCCCCG AACAACCCCG 1740

CAATCGCCGC CGACACTTCA TCGGCAGCCG CAGCCACCAC TTCCGTCGTC GGGATCGCCG 1800

CGGCCGCATT AGCCGCGCTC ACCTGCGAAC CAATAGTCGA TAAATCCAAA GCCGCAGTTG 1860

CCAGCAGCTG CGGCGTCGCG ATCACCAAGG ACACCTCGCA CCTCCGGATA CCCCATATCG 1920

CCGCACCGTG TCCCCAGCGG CCACGTGACC TTTGGTCGCT GGCTGGCGGC CCTGACTATG 1980

GCCGCGACGG CCCTCGTTCT GATTCGCCCC GGCGCGCAGC TTGTTGCGCG AGTTGAAGAC 2040

GGGAGGACAG GCCGAGCTTG GTGTAGACGT GGGTCAAGTG GGAATGCACG GTCCGCGGCG 2100

AGATGAATAG GCGGACGCCG ATCTCCTTGT TGCTGAGTCC CTCACCGACC AGTAGAGCCA 2160

CCTCAAGCTC TGTCGGTGTC AACGCGCCCC AGCCACTTGT CGGGCGTTTC CGTGCACCGC 2220

GGCCTCGTTG CGCGTACGCG ATCGCCTCAT CGATCGATAA CGCAGTTCCT TCGGCCCAGG 2280

CATCGTCGAA CTCGCTGTCA CCCATGGATT TTCGAAGGGT GGCTAGCGAC GAGTTACAGC 2340

CCGCCTGGTA GATCCCGAAG CGGACCG 2367


(2) INFORMATION FOR SEQ ID NO: 197: 









(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 376 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS:

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 197: 

Gln Pro Ala Gly Ala Thr Ile Ala Ala Ser Ser Pro Cys Ala Thr Val
1 5 10 15

Gly Ala Gly Gly Gly Thr Gly Ser Pro Val Thr Thr Glu Thr Ala Ala
20 25 30

Thr Thr Gly Arg Gly Gly Ser Gly Asp Val Tyr Glu Ser Ala Ala Ser
35 40 45

Gly Ala Ala Ala Thr Thr Pro Thr Ala Gly Gly Tyr Thr Val Gly Pro
50 55 60

Val Ala Thr Ile Thr Ala Lys Gly Ala Arg Asn Val Ala Leu Arg Asp
65 70 75 80

Ser Ala Val Ala Ala Val Ala Ala Ala Ala Thr Gly Ser Gly Gly Thr
85 90 95

Ala Val Thr Thr Gly Thr Ala Gly Gly Leu Ala Arg Ala Cys Arg Arg
100 105 110

Gly Gly Thr Val Ala Ala Gly Ala Thr Gly Arg Arg Ala Gly Ser Ala
115 120 125

Met Ala Ala Arg Ala Ala Val Ala Ala Gly Leu Ile Thr Asp Ala Gly
130 135 140

His Ile Cys Arg Ala Val Pro Gly Ala Gly Arg Gly Ala Gly Arg Gly
145 150 155 160

Ile Asp Pro Val Cys Pro Gly Glu Ala Gly Ala Ala Gly Thr Thr Gly
165 170 175

Ala Ala Met Ala Glu Gln Pro Gly Val Ala Ala Val Thr Ala Arg Thr
180 185 190

Pro Asp Ala Cys Gly His Ala Gly Ala Ala Asp Thr Ala Val Ala Ala
195 200 205

Val Ala Pro Gln Pro Pro Pro Val Pro Thr Gly Thr Ala Gly Arg Ala
210 215 220

Gly Thr Thr Gly Pro Ala Val Ala Ala Val Ala Asp Gln Pro Gly Arg
225 230 235 240

Ala Ser Ala Ala Ala Gly Leu Thr Glu Pro Ala Ser Arg Ala Val Ala
245 250 255

Thr Val Ala Lys Gln Gln Pro Ala Gly Arg Ala Arg Leu Pro Gly Cys
260 265 270

Arg Pro Val Gly Ala Val Ser Asp Gln Arg Ala Pro Gln Lys Arg Leu
275 280 285

Gly Gly Arg Ile His Arg Thr Gln Gln Thr Pro Leu Asn Ser Gly Phe
290 295 300

Ser Ala Gly Ile Pro Thr Arg Gly Arg Ser Gln Arg Leu His Lys Leu
305 310 315 320

Leu Val Lys Arg Cys His Leu Tyr Ala Glu Arg Leu Ile Leu Pro Ser
325 330 335

Met Gly Pro Glu Gln Pro Arg Asn Arg Arg Arg His Phe Ile Gly Ser
340 345 350

Arg Ser His His Phe Arg Arg Arg Asp Arg Arg Gly Arg Ile Ser Arg
355 360 365

Ala His Leu Arg Thr Asn Ser Arg 
370 375 

(2) INFORMATION FOR SEQ ID NO: 198: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2852 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 198: 



GGCCAAAACG CCCCGGCGAT CGCGGCCACC GAGGCCGCCT ACGACCAGAT GTGGGCCCAG 60

GACGTGGCGG CGATGTTTGG CTACCATGCC GGGGCTTCGG CGGCCGTCTC GGCGTTGACA 120

CCGTTCGGCC AGGCGCTGCC GACCGTGGCG GGCGGCGGTG CGCTGGTCAG CGCGGCCGCG 180

GCTCAGGTGA CCACGCGGGT CTTCCGCAAC CTGGGCTTGG CGAACGTCCG CGAGGGCAAC 240

GTCCGCAACG GTAATGTCCG GAACTTCAAT CTCGGCTCGG CCAACATCGG CAACGGCAAC 300

ATCGGCAGCG GCAACATCGG CAGCTCCAAC ATCGGGTTTG GCAACGTGGG TCCTGGGTTG 360

ACCGCAGCGC TGAACAACAT CGGTTTCGGC AACACCGGCA GCAACAACAT CGGGTTTGGC 420

AACACCGGCA GCAACAACAT CGGGTTCGGC AATACCGGAG ACGGCAACCG AGGTATCGGG 480

CTCACGGGTA GCGGTTTGTT GGGGTTCGGC GGCCTGAACT CGGGCACCGG CAACATCGGT 540

CTGTTCAACT CGGGCACCGG AAACGTCGGC ATCGGCAACT CGGGTACCGG GAACTGGGGC 600

ATTGGCAACT CGGGCAACAG CTACAACACC GGTTTTGGCA ACTCCGGCGA CGCCAACACG 660

GGCTTCTTCA ACTCCGGAAT AGCCAACACC GGCGTCGGCA ACGCCGGCAA CTACAACACC 720

GGTAGCTACA ACCCGGGCAA CAGCAATACC GGCGGCTTCA ACATGGGCCA GTACAACACG 780

GGCTACCTGA ACAGCGGCAA CTACAACACC GGCTTGGCAA ACTCCGGCAA TGTCAACACC 840

GGCGCCTTCA TTACTGGCAA CTTCAACAAC GGCTTCTTGT GGCGCGGCGA CCACCAAGGC 900

CTGATTTTCG GGAGCCCCGG CTTCTTCAAC TCGACCAGTG CGCCGTCGTC GGGATTCTTC 960

AACAGCGGTG CCGGTAGCGC GTCCGGCTTC CTGAACTCCG GTGCCAACAA TTCTGGCTTC 1020

TTCAACTCTT CGTCGGGGGC CATCGGTAAC TCCGGCCTGG CAAACGCGGG CGTGCTGGTA 1080

TCGGGCGTGA TCAACTCGGG CAACACCGTA TCGGGTTTGT TCAACATGAG CCTGGTGGCC 1140

ATCACAACGC CGGCCTTGAT CTCGGGCTTC TTCAACACCG GAAGCAACAT GTCGGGATTT 1200

TTCGGTGGCC CACCGGTCTT CAATCTCGGC CTGGCAAACC GGGGCGTCGT GAACATTCTC 1260

GGCAACGCCA ACATCGGCAA TTACAACATT CTCGGCAGCG GAAACGTCGG TGACTTCAAC 1320

ATCCTTGGCA GCGGCAACCT CGGCAGCCAA AACATCTTGG GCAGCGGCAA CGTCGGCAGC 1380

TTCAATATCG GCAGTGGAAA CATCGGAGTA TTCAATGTCG GTTCCGGAAG CCTGGGAAAC 1440

TACAACATCG GATCCGGAAA CCTCGGGATC TACAACATCG GTTTTGGAAA CGTCGGCGAC 1500 

TACAACGTCG GCTTCGGGAA CGCGGGCGAC TTCAACCAAG GCTTTGCCAA CACCGGCAAC 1560 

AACAACATCG GGTTCGCCAA CACCGGCAAC AACAACATCG GCATCGGGCT GTCCGGCGAC 1620 

AACCAGCAGG GCTTCAATAT TGCTAGCGGC TGGAACTCGG GCACCGGCAA CAGCGGCCTG 1680 

TTCAATTCGG GCACCAATAA CGTTGGCATC TTCAACGCGG GCACCGGAAA CGTCGGCATC 1740

GCAAACTCGG GCACCGGGAA CTGGGGTATC GGGAACCCGG GTACCGACAA TACCGGCATC 1800 

CTCAATGCTG GCAGCTACAA CACGGGCATC CTCAACGCCG GCGACTTCAA CACGGGCTTC 1860

TACAACACGG GCAGCTACAA CACCGGCGGC TTCAACGTCG GTAACACCAA CACCGGCAAC 1920 

TTCAACGTGG GTGACACCAA TACCGGCAGC TATAACCCGG GTGACACCAA CACCGGCTTC 1980 

TTCAATCCCG GCAACGTCAA TACCGGCGCT TTCGACACGG GCGACTTCAA CAATGGCTTC 2040

TTGGTGGCGG GCGATAACCA GGGCCAGATT GCCATCGATC TCTCGGTCAC CACTCCATTC 2100

ATCCCCATAA ACGAGCAGAT GGTCATTGAC GTACACAACG TAATGACCTT CGGCGGCAAC 2160

ATGATCACGG TCACCGAGGC CTCGACCGTT TTCCCCCAAA CCTTCTATCT GAGCGGTTTG 2220

TTCTTCTTCG GCCCGGTCAA TCTCAGCGCA TCCACGCTGA CCGTTCCGAC GATCACCCTC 2280

ACCATCGGCG GACCGACGGT GACCGTCCCC ATCAGCATTG TCGGTGCTCT GGAGAGCCGC 2340

ACGATTACCT TCCTCAAGAT CGATCCGGCG CCGGGCATCG GAAATTCGAC CACCAACCCC 2400 

TCGTCCGGCT TCTTCAACTC GGGCACCGGT GGCACATCTG GCTTCCAAAA CGTCGGCGGC 2460

GGCAGTTCAG GCGTCTGGAA CAGTGGTTTG AGCAGCGCGA TAGGGAATTC GGGTTTCCAG 2520 

AACCTCGGCT CGCTGCAGTC AGGCTGGGCG AACCTGGGCA ACTCCGTATC GGGCTTTTTC 2580 

AACACCAGTA CGGTGAACCT CTCCACGCCG GCCAATGTCT CGGGCCTGAA CAACATCGGC 2640

ACCAACCTGT CCGGCGTGTT CCGCGGTCCG ACCGGGACGA TTTTCAACGC GGGCCTTGCC 2700

AACCTGGGCC AGTTGAACAT CGGCAGCGCC TCGTGCCGAA TTCGGCACGA GTTAGATACG 2760

GTTTCAACAA TCATATCCGC GTTTTGCGGC AGTGCATCAG ACGAATCGAA CCCGGGAAGC 2820

GTAAGCGAAT AAACCGAATG GCGGCCTGTC AT 2852
(2) INFORMATION FOR SEQ ID NO: 199: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 943 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 199: 

Gly Gln Asn Ala Pro Ala Ile Ala Ala Thr Glu Ala Ala Tyr Asp Gln
1 5 10 15

Met Trp Ala Gln Asp Val Ala Ala Met Phe Gly Tyr His Ala Gly Ala
20 25 30

Ser Ala Ala Val Ser Ala Leu Thr Pro Phe Gly Gln Ala Leu Pro Thr
35 40 45

Val Ala Gly Gly Gly Ala Leu Val Ser Ala Ala Ala Ala Gln Val Thr
50 55 60

Thr Arg Val Phe Arg Asn Leu Gly Leu Ala Asn Val Arg Glu Gly Asn
65 70 75 80

Val Arg Asn Gly Asn Val Arg Asn Phe Asn Leu Gly Ser Ala Asn Ile
85 90 95

Gly Asn Gly Asn Ile Gly Ser Gly Asn Ile Gly Ser Ser Asn Ile Gly
100 105 110

Phe Gly Asn Val Gly Pro Gly Leu Thr Ala Ala Leu Asn Asn Ile Gly
115 120 125

Phe Gly Asn Thr Gly Ser Asn Asn Ile Gly Phe Gly Asn Thr Gly Ser
130 135 140

Asn Asn Ile Gly Phe Gly Asn Thr Gly Asp Gly Asn Arg Gly Ile Gly
145 150 155 160

Leu Thr Gly Ser Gly Leu Leu Gly Phe Gly Gly Leu Asn Ser Gly Thr
165 170 175

Gly Asn Ile Gly Leu Phe Asn Ser Gly Thr Gly Asn Val Gly Ile Gly
180 185 190

Asn Ser Gly Thr Gly Asn Trp Gly Ile Gly Asn Ser Gly Asn Ser Tyr
195 200 205

Asn Thr Gly Phe Gly Asn Ser Gly Asp Ala Asn Thr Gly Phe Phe Asn
210 215 220

Ser Gly Ile Ala Asn Thr Gly Val Gly Asn Ala Gly Asn Tyr Asn Thr
225 230 235 240

Gly Ser Tyr Asn Pro Gly Asn Ser Asn Thr Gly Gly Phe Asn Met Gly
245 250 255

Gln Tyr Asn Thr Gly Tyr Leu Asn Ser Gly Asn Tyr Asn Thr Gly Leu
260 265 270

Ala Asn Ser Gly Asn Val Asn Thr Gly Ala Phe Ile Thr Gly Asn Phe
275 280 285

Asn Asn Gly Phe Leu Trp Arg Gly Asp His Gln Gly Leu Ile Phe Gly
290 295 300

Ser Pro Gly Phe Phe Asn Ser Thr Ser Ala Pro Ser Ser Gly Phe Phe 
305 310 315 320 

Asn Ser Gly Ala Gly Ser Ala Ser Gly Phe Leu Asn Ser Gly Ala Asn 
325 330 335 

Asn Ser Gly Phe Phe Asn Ser Ser Ser Gly Ala He Gly Asn Ser Gly 
340 345 350 j 

Leu Ala Asn Ala Gly Val Leu Val Ser Gly Val He Asn Ser Gly Asn 
355 360 365 

Thr Val Ser Gly Leu Phe Asn Met Ser Leu Val Ala He Thr Thr Pro 
370 375 380 

Ala Leu He Ser Gly Phe Phe Asn Thr Gly Ser Asn Met Ser Gly Phe 
385 390 395 400 

Phe Gly Gly Pro Pro Val Phe Asn Leu Gly Leu Ala Asn Arg Gly Val 
405 410 415 

Val Asn He Leu Gly Asn Ala Asn He Gly Asn Tyr Asn He Leu Gly 
420 ~ 425 430 

Ser Gly Asn Val Gly Asp Phe Asn He Leu Gly Ser Gly Asn Leu Gly 
435 440 445 

Ser Gin Asn He Leu Gly Ser Gly Asn Val Gly Ser Phe Asn He Gly 
450 455 460 

Ser Gly Asn He Gly Val Phe Asn Val Gly Ser Gly Ser Leu Gly Asn 
465 470 475 480 

Tyr Asn He Gly Ser Gly Asn Leu Gly He Tyr Asn He Gly Phe Gly 
485 490 495 

Asn Val Gly Asp Tyr Asn Val Gly Phe Gly Asn Ala Gly Asp Phe Asn 
500 505 510 

Gin Gly Phe Ala Asn Thr Gly Asn Asn Asn He Gly Phe Ala Asn Thr 
515 520 525 

Gly Asn Asn Asn He Gly He Gly Leu Ser Gly Asp Asn Gin Gin Gly 
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530 535 540 

Phe Asn He Ala Ser Gly Trp Asn Ser Gly Thr Gly Asn Ser Gly Leu 
545 550 555 560 

Phe Asn Ser Gly Thr Asn Asn Val Gly He Phe Asn Ala Gly Thr Gly 
565 570 575 

Asn Val Gly He Ala Asn Ser Gly Thr Gly Asn Trp Gly He Gly Asn 
580 585 590 

Pro Gly Thr Asp Asn Thr Gly He Leu Asn Ala Gly Ser Tyr Asn Thr 
595 600 605 

Gly He Leu Asn Ala Gly Asp Phe Asn Thr Gly Phe Tyr Asn Thr Gly 
610 615 620 

Ser Tyr Asn Thr Gly Gly Phe Asn Val Gly Asn Thr Asn Thr Gly Asn 

fi-^n 635 / 640 



625 



Phe Asn Val Gly Asp Thr Asn Thr Gly Ser Tyr Asn Pro Gly Asp Thr
645 650 655

Asn Thr Gly Phe Phe Asn Pro Gly Asn Val Asn Thr Gly Ala Phe Asp
660 665 670

Thr Gly Asp Phe Asn Asn Gly Phe Leu Val Ala Gly Asp Asn Gln Gly
675 680 685

Gln Ile Ala Ile Asp Leu Ser Val Thr Thr Pro Phe Ile Pro Ile Asn
690 695 700

Glu Gln Met Val Ile Asp Val His Asn Val Met Thr Phe Gly Gly Asn
705 710 715 720

Met Ile Thr Val Thr Glu Ala Ser Thr Val Phe Pro Gln Thr Phe Tyr
725 730 735

Leu Ser Gly Leu Phe Phe Phe Gly Pro Val Asn Leu Ser Ala Ser Thr
740 745 750

Leu Thr Val Pro Thr Ile Thr Leu Thr Ile Gly Gly Pro Thr Val Thr
755 760 765

Val Pro Ile Ser Ile Val Gly Ala Leu Glu Ser Arg Thr Ile Thr Phe
770 775 780

Leu Lys Ile Asp Pro Ala Pro Gly Ile Gly Asn Ser Thr Thr Asn Pro
785 790 795 800

Ser Ser Gly Phe Phe Asn Ser Gly Thr Gly Gly Thr Ser Gly Phe Gln
805 810 815

Asn Val Gly Gly Gly Ser Ser Gly Val Trp Asn Ser Gly Leu Ser Ser
820 825 830






Ala Ile Gly Asn Ser Gly Phe Gln Asn Leu Gly Ser Leu Gln Ser Gly
835 840 845

Trp Ala Asn Leu Gly Asn Ser Val Ser Gly Phe Phe Asn Thr Ser Thr
850 855 860

Val Asn Leu Ser Thr Pro Ala Asn Val Ser Gly Leu Asn Asn Ile Gly
865 870 875 880

Thr Asn Leu Ser Gly Val Phe Arg Gly Pro Thr Gly Thr Ile Phe Asn
885 890 895

Ala Gly Leu Ala Asn Leu Gly Gln Leu Asn Ile Gly Ser Ala Ser Cys
900 905 910

Arg Ile Arg His Glu Leu Asp Thr Val Ser Thr Ile Ile Ser Ala Phe
915 920 925

Cys Gly Ser Ala Ser Asp Glu Ser Asn Pro Gly Ser Val Ser Glu
930 935 940

(2) INFORMATION FOR SEQ ID NO: 200: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 53 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 200: 
GGATCCATAT GGGCCATCAT CATCATCATC ACGTGATCGA CATCATCGGG ACC 
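
As a cross-check on this primer, the following minimal Python sketch translates SEQ ID NO: 200 from its first ATG codon using the standard genetic code. The function name `translate_from_first_atg` and the constant names are illustrative assumptions and are not part of the specification; the sketch only reads the single forward frame that begins at the first ATG.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# SEQ ID NO: 200 as listed above, written in the listing's groups of ten.
SEQ_ID_NO_200 = (
    "GGATCCATAT" "GGGCCATCAT" "CATCATCATC"
    "ACGTGATCGA" "CATCATCGGG" "ACC"
)


def translate_from_first_atg(dna: str) -> str:
    """Translate from the first ATG until a stop codon or the end of the sequence."""
    start = dna.find("ATG")
    if start < 0:
        return ""
    protein = []
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":  # stop codon ends the reading frame
            break
        protein.append(aa)
    return "".join(protein)


print(len(SEQ_ID_NO_200))                       # 53
print(translate_from_first_atg(SEQ_ID_NO_200))  # MGHHHHHHVIDIIGT
```

The one-letter result corresponds to Met-Gly followed by six His residues and then Val-Ile-Asp-Ile-Ile-Gly-Thr, which is the same run of residues that opens SEQ ID NO: 209.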
(2) INFORMATION FOR SEQ ID NO: 201: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 42 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 201: 
CCTGAATTCA GGCCTCGGTT GCGCCGGCCT CATCTTGAAC GA 42
(2) INFORMATION FOR SEQ ID NO: 202:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 202:
GGATCCTGCA GGCTCGAAAC CACCGAGCGG T 
(2) INFORMATION FOR SEQ ID NO: 203: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 31 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 203: 
CTCTGAATTC AGCGCTGGAA ATCGTCGCGA T 
(2) INFORMATION FOR SEQ ID NO: 204: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 33 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 204: 
GGATCCAGCG CTGAGATGAA GACCGATGCC GCT 
(2) INFORMATION FOR SEQ ID NO: 205: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 38 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 205:
GGATATCTGC AGAATTCAGG TTTAAAGCCC ATTTGCGA 38 
(2) INFORMATION FOR SEQ ID NO: 206: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 30 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 206:
CCGCATGCGA GCCACGTGCC CACAACGGCC 30 
(2) INFORMATION FOR SEQ ID NO: 207: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 207: 
CTTCATGGAA TTCTCAGGCC GGTAAGGTCC GCTGCGG 37 
(2) INFORMATION FOR SEQ ID NO: 208: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7676 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 208: 
TGGCGAATGG GACGCGCCCT GTAGCGGCGC ATTAAGCGCG GCGGGTGTGG TGGTTACGCG 60 
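
[illegible] [illegible] CCAGCGCCCT AGCGCCCGCT CCTTTCGCTT TCTTCCCTTC 120

[illegible] [illegible] GCTTTCCCCG TCAAGCTCTA AATCGGGGGC TCCCTTTAGG 180

[illegible] [illegible] GGCACCTCGA CCCCAAAAAA CTTGATTAGG GTGATGGTTC 240

ACGTAGTGGG [illegible] GATAGACGGT TTTTCGCCCT TTGACGTTGG AGTCCACGTT 300

CTTTAATAGT [illegible] [illegible] AACAACACTC AACCCTATCT CGGTCTATTC 360

TTTTGATTTA [illegible] [illegible] GGCCTATTGG TTAAAAAATG AGCTGATTTA 420

ACAAAAATTT [illegible] [illegible] ATTAACGTTT ACAATTTCAG GTGGCACTTT 480

TCGGGGAAAT [illegible] [illegible] TTTATTTTTC TAAATACATT CAAATATGTA 540

TCCGCTCATG [illegible] [illegible] CATCGAGCAT CAAATGAAAC TGCAATTTAT 600

TCATATCAGG ATTATCAATA [illegible] GAAAAAGCCG TTTCTGTAAT GAAGGAGAAA 660

ACTCACCGAG GCAGTTCCAT [illegible] GATCCTGGTA TCGGTCTGCG ATTCCGACTC 720

GTCCAACATC AATACAACCT [illegible] CCTCGTCAAA AATAAGGTTA TCAAGTGAGA 780

AATCACCATG AGTGACGACT [illegible] AGAATGGCAA AAGTTTATGC ATTTCTTTCC 840

AGACTTGTTC AACAGGCCAG [illegible] CGTCATCAAA ATCACTCGCA TCAACCAAAC 900

CGTTATTCAT [illegible] [illegible] GACGAAATAC GCGATCGCTG TTAAAAGGAC 960

AATTACAAAC [illegible] [illegible] GCAGGAACAC TGCCAGCGCA TCAACAATAT 1020

TTTCACCTGA [illegible] TCTTCTAATA CCTGGAATGC TGTTTTCCCG GGGATCGCAG 1080

TGGTGAGTAA [illegible] TCAGGAGTAC GGATAAAATG CTTGATGGTC GGAAGAGGCA 1140

TAAATTCCGT [illegible] AGTCTGACCA TCTCATCTGT AACATCATTG GCAACGCTAC 1200

CTTTGCCATG [illegible] AACTCTGGCG CATCGGGCTT CCCATACAAT CGATAGATTG 1260

TCGCACCTGA [illegible] TTATCGCGAG CCCATTTATA CCCATATAAA TCAGCATCCA 1320

TGTTGGAATT TAATCGCGGC CTAGAGCAAG ACGTTTCCCG TTGAATATGG CTCATAACAC 1380

CCCTTGTATT ACTGTTTATG TAAGCAGACA GTTTTATTGT TCATGACCAA AATCCCTTAA 1440

CGTGAGTTTT CGTTCCACTG AGCGTCAGAC CCCGTAGAAA AGATCAAAGG ATCTTCTTGA 1500

GATCCTTTTT TTCTGCGCGT AATCTGCTGC TTGCAAACAA AAAAACCACC GCTACCAGCG 1560

GTGGTTTGTT TGCCGGATCA AGAGCTACCA ACTCTTTTTC CGAAGGTAAC TGGCTTCAGC 1620

AGAGCGCAGA TACCAAATAC TGTCCTTCTA GTGTAGCCGT AGTTAGGCCA CCACTTCAAG 1680

AACTCTGTAG CACCGCCTAC ATACCTCGCT CTGCTAATCC TGTTACCAGT GGCTGCTGCC 1740

AGTGGCGATA AGTCGTGTCT TACCGGGTTG GACTCAAGAC GATAGTTACC GGATAAGGCG 1800

CAGCGGTCGG GCTGAACGGG GGGTTCGTGC ACACAGCCCA GCTTGGAGCG AACGACCTAC 1860

ACCGAACTGA GATACCTACA GCGTGAGCTA TGAGAAAGCG CCACGCTTCC CGAAGGGAGA 1920

AAGGCGGACA GGTATCCGGT AAGCGGCAGG GTCGGAACAG GAGAGCGCAC GAGGGAGCTT 1980

CCAGGGGGAA ACGCCTGGTA TCTTTATAGT CCTGTCGGGT TTCGCCACCT CTGACTTGAG 2040

CGTCGATTTT TGTGATGCTC GTCAGGGGGG CGGAGCCTAT GGAAAAACGC CAGCAACGCG 2100

GCCTTTTTAC GGTTCCTGGC CTTTTGCTGG CCTTTTGCTC ACATGTTCTT TCCTGCGTTA 2160

TCCCCTGATT CTGTGGATAA CCGTATTACC GCCTTTGAGT GAGCTGATAC CGCTCGCCGC 2220

AGCCGAACGA CCGAGCGCAG CGAGTCAGTG AGCGAGGAAG CGGAAGAGCG CCTGATGCGG 2280

TATTTTCTCC TTACGCATCT GTGCGGTATT TCACACCGCA TATATGGTGC ACTCTCAGTA 2340

CAATCTGCTC TGATGCCGCA TAGTTAAGCC AGTATACACT CCGCTATCGC TACGTGACTG 2400

GGTCATGGCT GCGCCCCGAC ACCCGCCAAC ACCCGCTGAC GCGCCCTGAC GGGCTTGTCT 2460

GCTCCCGGCA TCCGCTTACA GACAAGCTGT GACCGTCTCC GGGAGCTGCA TGTGTCAGAG 2520

GTTTTCACCG TCATCACCGA AACGCGCGAG GCAGCTGCGG TAAAGCTCAT CAGCGTGGTC 2580

GTGAAGCGAT TCACAGATGT CTGCCTGTTC ATCCGCGTCC AGCTCGTTGA GTTTCTCCAG 2640

AAGCGTTAAT GTCTGGCTTC TGATAAAGCG GGCCATGTTA AGGGCGGTTT TTTCCTGTTT 2700

GGTCACTGAT GCCTCCGTGT AAGGGGGATT TCTGTTCATG GGGGTAATGA TACCGATGAA 2760

ACGAGAGAGG ATGCTCACGA TACGGGTTAC TGATGATGAA CATGCCCGGT TACTGGAACG 2820

TTGTGAGGGT AAACAACTGG CGGTATGGAT GCGGCGGGAC CAGAGAAAAA TCACTCAGGG 2880

TCAATGCCAG CGCTTCGTTA ATACAGATGT AGGTGTTCCA CAGGGTAGCC AGCAGCATCC 2940

TGCGATGCAG ATCCGGAACA TAATGGTGCA GGGCGCTGAC TTCCGCGTTT CCAGACTTTA 3000

CGAAACACGG AAACCGAAGA CCATTCATGT TGTTGCTCAG GTCGCAGACG TTTTGCAGCA 3060

GCAGTCGCTT CACGTTCGCT CGCGTATCGG TGATTCATTC TGCTAACCAG TAAGGCAACC 3120

CCGCCAGCCT AGCCGGGTCC TCAACGACAG GAGCACGATC ATGCGCACCC GTGGGGCCGC 3180

CATGCCGGCG ATAATGGCCT GCTTCTCGCC GAAACGTTTG GTGGCGGGAC CAGTGACGAA 3240

GGCTTGAGCG AGGGCGTGCA AGATTCCGAA TACCGCAAGC GACAGGCCGA TCATCGTCGC 3300

GCTCCAGCGA AAGCGGTCCT CGCCGAAAAT GACCCAGAGC GCTGCCGGCA CCTGTCCTAC 3360
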


AAGCGGCAGG 


GTCGGAACAG 


GAG AG C G C AC 


GAGGGAGCTT 


1980 


CCAGGGGGAA 


ACGCCTGGTA 


TCTTTATAGT 


CCTGTCGGGT 


TTCGCCACCT 


CTGACTTGAG 


2040 


CGTCGATTTT 


TGTGATGCTC 


GTCAGGGGGG 


CGGAGCCTAT 


GGAAAAACGC 


CAGCAACGCG 


2100 


GCCTTTTTAC 


GGTTCCTGGC 


CTTTTGCTGG 


CCTTTTGCTC 


ACATGTTCTT 


TCCTGCGTTA 


2160 


TCCCCTGATT 


CTGTGGATAA 


CCGTATTACC 


GCCTTTGAGT 


GAGCT GAT AC 


CGCTCGCCGC 

/ 


2220 


AGCCGAACGA 


CCGAGCGCAG 


CGAGTCAGTG 


AGCGAGGAAG 


CGGAAGAGCG 


CCTGATGCGG 


2280 


TATTTTCTCC 


TTACGCATCT 


GTGCGGTATT 


TCACACCGCA 


TATATGGTGC 


ACTCTCAGTA 


2340 


CAATCTGCTC 


TGATGCCGCA 


TAGTTAAGCC 


AGTATACACT 


CCGCTATCGC 


TACGTGACTG 


2400 


GGTCATGGCT 


GCGCCCCGAC 


ACCCGCCAAC 


ACCCGCTGAC 


GCGCCCTGAC 


GGGCTTGTCT 


2460 


GCTCCCGGCA 


TCCGCTTACA 


GACAAGCTGT 


GACCGTCTCC 


GGGAGCTGCA 


TGTGTCAGAG 


2520 


GTTTTCACCG 


TCATCACCGA 


AACGCGCGAG 


GCAGCTGCGG 


TAAAGCTCAT 


CAGCGTGGTC 


2580 


GTGAAGCGAT 


TCACAGATGT 


CTGCCTGTTC 


ATCCGCGTCC 


AGCTCGTTGA 


GTTTCTCCAG 


2640 


AAGCGTTAAT 


GTCTGGCTTC 


TGATAAAGCG 


GGCCATGTTA 


AGGGCGGTTT 


TTTCCTGTTT 


2700 


GGTCACTGAT 


GCCTCCGTGT 


AAGGGGGATT 


TCTGTTCATG 


GGGGTAATGA 


TACCGATGAA 


2760 


ACGAGAGAGG 


ATGCTCACGA 


TACGGGTTAC 


TGATGATGAA 


CATGCCCGGT 


TACTGGAACG 


2820 


TTGTGAGGGT 


AAACAACTGG 


CGGTATGGAT 


GCGGCGGGAC 


CAGAGAAAAA 


TCACTCAGGG 


2880 


TCAATGCCAG 


CGCTTCGTTA 


ATACAGATGT 


AGGTGTTCCA 


CAGGGTAGCC 


AGCAGCATCC 


2940 


TGCGATGCAG 


ATCCGGAACA 


TAATGGTGCA 


GGGCGCTGAC 


TTCCGCGTTT 


CCAGACTTTA 


3000 


CGAAACACGG 


AAACCGAAGA 


CCATTCATGT 


TGTTGCTCAG 


GTCGCAGACG 


TTTTGCAGCA 


3060 


GCAGTCGCTT 


CACGTTCGCT 


CGCGTATCGG 


TGATTCATTC 


TGCTAACCAG 


TAAGGCAACC 


3120 


CCGCCAGCCT 


AGCCGGGTCC 


TCAACGACAG 


GAGCACGATC 


ATGCGCACCC 


GTGGGGCCGC 


3180 


CATGCCGGCG 


ATAATGGCCT 


GCTTCTCGCC 


GAAACGTTTG 


GTGGCGGGAC 


CAGTGACGAA 


3240 


GGCTTGAGCG 


AGGGCGTGCA 


AGATTCCGAA 


TACCGCAAGC 


GACAGGCCGA 


TCATCGTCGC 


3300 


GCTCCAGCGA 


AAGCGGTCCT 


CGCCGAAAAT 


GACCCAGAGC 


GCTGCCGGCA 


, CCTGTCCTAC 


3360 
GAGTTGCATG ATAAAGAAGA CAGTCATAAG TGCGGCGACG ATAGTCATGC CCCGCGCCCA 3420

CCGGAAGGAG CTGACTGGGT TGAAGGCTCT CAAGGGCATC GGTCGAGATC CCGGTGCCTA 3480

ATGAGTGAGC TAACTTACAT TAATTGCGTT GCGCTCACTG CCCGCTTTCC AGTCGGGAAA 3540

CCTGTCGTGC CAGCTGCATT AATGAATCGG CCAACGCGCG GGGAGAGGCG GTTTGCGTAT 3600

TGGGCGCCAG GGTGGTTTTT CTTTTCACCA GTGAGACGGG CAACAGCTGA TTGCCCTTCA 3660

CCGCCTGGCC CTGAGAGAGT TGCAGCAAGC GGTCCACGCT GGTTTGCCCC AGCAGGCGAA 3720

AATCCTGTTT GATGGTGGTT AACGGCGGGA TATAACATGA GCTGTCTTCG GTATCGTCGT 3780

ATCCCACTAC CGAGATATCC GCACCAACGC GCAGCCCGGA CTCGGTAATG GCGCGCATTG 3840

CGCCCAGCGC CATCTGATCG TTGGCAACCA GCATCGCAGT GGGAACGATG CCCTCATTCA 3900

GCATTTGCAT GGTTTGTTGA AAACCGGACA TGGCACTCCA GTCGCCTTCC CGTTCCGCTA 3960

TCGGCTGAAT TTGATTGCGA GTGAGATATT TATGCCAGCC AGCCAGACGC AGACGCGCCG 4020

AGACAGAACT TAATGGGCCC GCTAACAGCG CGATTTGCTG GTGACCCAAT GCGACCAGAT 4080

GCTCCACGCC CAGTCGCGTA CCGTCTTCAT GGGAGAAAAT AATACTGTTG ATGGGTGTCT 4140

GGTCAGAGAC ATCAAGAAAT AACGCCGGAA CATTAGTGCA GGCAGCTTCC ACAGCAATGG 4200

CATCCTGGTC ATCCAGCGGA TAGTTAATGA TCAGCCCACT GACGCGTTGC GCGAGAAGAT 4260

TGTGCACCGC CGCTTTACAG GCTTCGACGC CGCTTCGTTC TACCATCGAC ACCACCACGC 4320

TGGCACCCAG TTGATCGGCG CGAGATTTAA TCGCCGCGAC AATTTGCGAC GGCGCGTGCA 4380

GGGCCAGACT GGAGGTGGCA ACGCCAATCA GCAACGACTG TTTGCCCGCC AGTTGTTGTG 4440

CCACGCGGTT GGGAATGTAA TTCAGCTCCG CCATCGCCGC TTCCACTTTT TCCCGCGTTT 4500

TCGCAGAAAC GTGGCTGGCC TGGTTCACCA CGCGGGAAAC GGTCTGATAA GAGACACCGG 4560

CATACTCTGC GACATCGTAT AACGTTACTG GTTTCACATT CACCACCCTG AATTGACTCT 4620

CTTCCGGGCG CTATCATGCC ATACCGCGAA AGGTTTTGCG CCATTCGATG GTGTCCGGGA 4680

TCTCGACGCT CTCCCTTATG CGACTCCTGC ATTAGGAAGC AGCCCAGTAG TAGGTTGAGG 4740

CCGTTGAGCA CCGCCGCCGC AAGGAATGGT GCATGCAAGG AGATGGCGCC CAACAGTCCC 4800

CCGGCCACGG GGCCTGCCAC CATACCCACG CCGAAACAAG CGCTCATGAG CCCGAAGTGG 4860

CGAGCCCGAT CTTCCCCATC GGTGATGTCG GCGATATAGG CGCCAGCAAC CGCACCTGTG 4920

GCGCCGGTGA TGCCGGCCAC GATGCGTCCG GCGTAGAGGA TCGAGATCTC GATCCCGCGA 4980

AATTAATACG ACTCACTATA GGGGAATTGT GAGCGGATAA CAATTCCCCT CTAGAAATAA 5040

TTTTGTTTAA CTTTAAGAAG GAGATATACA TATGGGCCAT CATCATCATC ATCACGTGAT 5100

CGACATCATC GGGACCAGCC CCACATCCTG GGAACAGGCG GCGGCGGAGG CGGTCCAGCG 5160

GGCGCGGGAT AGCGTCGATG ACATCCGCGT CGCTCGGGTC ATTGAGCAGG ACATGGCCGT 5220

GGACAGCGCC GGCAAGATCA CCTACCGCAT CAAGCTCGAA GTGTCGTTCA AGATGAGGCC 5280

GGCGCAACCG AGGGGCTCGA AACCACCGAG CGGTTCGCCT GAAACGGGCG CCGGCGCCGG 5340

TACTGTCGCG ACTACCCCCG CGTCGTCGCC GGTGACGTTG GCGGAGACCG GTAGCACGCT 5400

GCTCTACCCG CTGTTCAACC TGTGGGGTCC GGCCTTTCAC GAGAGGTATC CGAACGTCAC 5460

GATCACCGCT CAGGGCACCG GTTCTGGTGC CGGGATCGCG CAGGCCGCCG CCGGGACGGT 5520

CAACATTGGG GCCTCCGACG CCTATCTGTC GGAAGGTGAT ATGGCCGCGC ACAAGGGGCT 5580

GATGAACATC GCGCTAGCCA TCTCCGCTCA GCAGGTCAAC TACAACCTGC CCGGAGTGAG 5640

CGAGCACCTC AAGCTGAACG GAAAAGTCCT GGCGGCCATG TACCAGGGCA CCATCAAAAC 5700

CTGGGACGAC CCGCAGATCG CTGCGCTCAA CCCCGGCGTG AACCTGCCCG GCACCGCGGT 5760

AGTTCCGCTG CACCGCTCCG ACGGGTCCGG TGACACCTTC TTGTTCACCC AGTACCTGTC 5820

CAAGCAAGAT CCCGAGGGCT GGGGCAAGTC GCCCGGCTTC GGCACCACCG TCGACTTCCC 5880

GGCGGTGCCG GGTGCGCTGG GTGAGAACGG CAACGGCGGC ATGGTGACCG GTTGCGCCGA 5940

GACACCGGGC TGCGTGGCCT ATATCGGCAT CAGCTTCCTC GACCAGGCCA GTCAACGGGG 6000

ACTCGGCGAG GCCCAACTAG GCAATAGCTC TGGCAATTTC TTGTTGCCCG ACGCGCAAAG 6060

CATTCAGGCC GCGGCGGCTG GCTTCGCATC GAAAACCCCG GCGAACCAGG CGATTTCGAT 6120

GATCGACGGG CCCGCCCCGG ACGGCTACCC GATCATCAAC TACGAGTACG CCATCGTCAA 6180

CAACCGGCAA AAGGACGCCG CCACCGCGCA GACCTTGCAG GCATTTCTGC ACTGGGCGAT 6240

CACCGACGGC AACAAGGCCT CGTTCCTCGA CCAGGTTCAT TTCCAGCCGC TGCCGCCCGC 6300

GGTGGTGAAG TTGTCTGACG CGTTGATCGC GACGATTTCC AGCGCTGAGA TGAAGACCGA 6360

TGCCGCTACC CTCGCGCAGG AGGCAGGTAA TTTCGAGCGG ATCTCCGGCG ACCTGAAAAC 6420

CCAGATCGAC CAGGTGGAGT CGACGGCAGG TTCGTTGCAG GGCCAGTGGC GCGGCGCGGC 6480

GGGGACGGCC GCCCAGGCCG CGGTGGTGCG CTTCCAAGAA GCAGCCAATA AGCAGAAGCA 6540

GGAACTCGAC GAGATCTCGA CGAATATTCG TCAGGCCGGC GTCCAATACT CGAGGGCCGA 6600

CGAGGAGCAG CAGCAGGCGC TGTCCTCGCA AATGGGCTTT GTGCCCACAA CGGCCGCCTC 6660

[illegible] ACCGCTGCAG CGCCACCCGC ACCGGCGACA CCTGTTGCCC CCCCACCACC 6720

GGCCGCCGCC AACACGCCGA ATGCCCAGCC GGGCGATCCC AACGCAGCAC CTCCGCCGGC 6780

[illegible] GCACCGCCGC CACCTGTCAT TGCCCCAAAC GCACCCCAAC CTGTCCGGAT 6840

[illegible] GTTGGAGGAT TCAGCTTCGC GCTGCCTGCT GGCTGGGTGG AGTCTGACGC 6900

[illegible] GACTACGGTT CAGCACTCCT CAGCAAAACC ACCGGGGACC CGCCATTTCC 6960

[illegible] CCGCCGGTGG CCAATGACAC CCGTATCGTG CTCGGCCGGC TAGACCAAAA 7020

[illegible] AGCGCCGAAG CCACCGACTC CAAGGCCGCG GCCCGGTTGG GCTCGGACAT 7080

[illegible] TATATGCCCT ACCCGGGCAC CCGGATCAAC CAGGAAACCG TCTCGCTTGA 7140

[illegible] [illegible] GCGCGTCGTA TTACGAAGTC AAGTTCAGCG ATCCGAGTAA 7200

[illegible] [illegible] CGGGCGTAAT CGGCTCGCCC GCGGCGAACG CACCGGACGC 7260

[illegible] [illegible] TTGTGGTATG GCTCGGGACC GCCAACAACC CGGTGGACAA 7320

GGGCGCGGCC AAGGCGCTGG [illegible] [illegible] GTCGCCCCGC CGCCGGCGCC 7380

GGCACCGGCT CCTGCAGAGC CCGCTCCGGC GCCGGCGCCG GCCGGGGAAG TCGCTCCTAC 7440

CCCGACGACA CCGACACCGC AGCGGACCTT ACCGGCCTGA GAATTCTGCA GATATCCATC 7500

ACACTGGCGG CCGCTCGAGC ACCACCACCA CCACCACTGA GATCCGGCTG CTAACAAAGC 7560

CCGAAAGGAA GCTGAGTTGG CTGCTGCCAC CGCTGAGCAA TAACTAGCAT AACCCCTTGG 7620

GGCCTCTAAA CGGGTCTTGA GGGGTTTTTT GCTGAAAGGA GGAACTATAT CCGGAT 7676



(2) INFORMATION FOR SEQ ID NO: 209: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 802 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 209:

Met Gly His His His His His His Val Ile Asp Ile Ile Gly Thr Ser
1 5 10 15

Pro Thr Ser Trp Glu Gln Ala Ala Ala Glu Ala Val Gln Arg Ala Arg
20 25 30

Asp Ser Val Asp Asp Ile Arg Val Ala Arg Val Ile Glu Gln Asp Met
35 40 45

Ala Val Asp Ser Ala Gly Lys Ile Thr Tyr Arg Ile Lys Leu Glu Val
50 55 60

Ser Phe Lys Met Arg Pro Ala Gln Pro Arg Gly Ser Lys Pro Pro Ser
65 70 75 80

Gly Ser Pro Glu Thr Gly Ala Gly Ala Gly Thr Val Ala Thr Thr Pro
85 90 95

Ala Ser Ser Pro Val Thr Leu Ala Glu Thr Gly Ser Thr Leu Leu Tyr
100 105 110

Pro Leu Phe Asn Leu Trp Gly Pro Ala Phe His Glu Arg Tyr Pro Asn
115 120 125

Val Thr Ile Thr Ala Gln Gly Thr Gly Ser Gly Ala Gly Ile Ala Gln
130 135 140

Ala Ala Ala Gly Thr Val Asn Ile Gly Ala Ser Asp Ala Tyr Leu Ser
145 150 155 160

Glu Gly Asp Met Ala Ala His Lys Gly Leu Met Asn Ile Ala Leu Ala
165 170 175

Ile Ser Ala Gln Gln Val Asn Tyr Asn Leu Pro Gly Val Ser Glu His
180 185 190

Leu Lys Leu Asn Gly Lys Val Leu Ala Ala Met Tyr Gln Gly Thr Ile
195 200 205

Lys Thr Trp Asp Asp Pro Gln Ile Ala Ala Leu Asn Pro Gly Val Asn
210 215 220

Leu Pro Gly Thr Ala Val Val Pro Leu His Arg Ser Asp Gly Ser Gly
225 230 235 240

Asp Thr Phe Leu Phe Thr Gln Tyr Leu Ser Lys Gln Asp Pro Glu Gly
245 250 255

Trp Gly Lys Ser Pro Gly Phe Gly Thr Thr Val Asp Phe Pro Ala Val
260 265 270

Pro Gly Ala Leu Gly Glu Asn Gly Asn Gly Gly Met Val Thr Gly Cys
275 280 285

Ala Glu Thr Pro Gly Cys Val Ala Tyr Ile Gly Ile Ser Phe Leu Asp
290 295 300

Gln Ala Ser Gln Arg Gly Leu Gly Glu Ala Gln Leu Gly Asn Ser Ser
305 310 315 320

Gly Asn Phe Leu Leu Pro Asp Ala Gln Ser Ile Gln Ala Ala Ala Ala
325 330 335

Gly Phe Ala Ser Lys Thr Pro Ala Asn Gln Ala Ile Ser Met Ile Asp
340 345 350

Gly Pro Ala Pro Asp Gly Tyr Pro Ile Ile Asn Tyr Glu Tyr Ala Ile
355 360 365

Val Asn Asn Arg Gln Lys Asp Ala Ala Thr Ala Gln Thr Leu Gln Ala
370 375 380

Phe Leu His Trp Ala Ile Thr Asp Gly Asn Lys Ala Ser Phe Leu Asp
385 390 395 400

Gln Val His Phe Gln Pro Leu Pro Pro Ala Val Val Lys Leu Ser Asp
405 410 415

Ala Leu Ile Ala Thr Ile Ser Ser Ala Glu Met Lys Thr Asp Ala Ala
420 425 430

Thr Leu Ala Gln Glu Ala Gly Asn Phe Glu Arg Ile Ser Gly Asp Leu
435 440 445

Lys Thr Gln Ile Asp Gln Val Glu Ser Thr Ala Gly Ser Leu Gln Gly
450 455 460

Gln Trp Arg Gly Ala Ala Gly Thr Ala Ala Gln Ala Ala Val Val Arg
465 470 475 480

Phe Gln Glu Ala Ala Asn Lys Gln Lys Gln Glu Leu Asp Glu Ile Ser
485 490 495

Thr Asn Ile Arg Gln Ala Gly Val Gln Tyr Ser Arg Ala Asp Glu Glu
500 505 510

Gln Gln Gln Ala Leu Ser Ser Gln Met Gly Phe Val Pro Thr Thr Ala
515 520 525

Ala Ser Pro Pro Ser Thr Ala Ala Ala Pro Pro Ala Pro Ala Thr Pro
530 535 540

Val Ala Pro Pro Pro Pro Ala Ala Ala Asn Thr Pro Asn Ala Gln Pro
545 550 555 560

Gly Asp Pro Asn Ala Ala Pro Pro Pro Ala Asp Pro Asn Ala Pro Pro
565 570 575

Pro Pro Val Ile Ala Pro Asn Ala Pro Gln Pro Val Arg Ile Asp Asn
580 585 590

Pro Val Gly Gly Phe Ser Phe Ala Leu Pro Ala Gly Trp Val Glu Ser
595 600 605

Asp Ala Ala His Phe Asp Tyr Gly Ser Ala Leu Leu Ser Lys Thr Thr
610 615 620

Gly Asp Pro Pro Phe Pro Gly Gln Pro Pro Pro Val Ala Asn Asp Thr
625 630 635 640

Arg Ile Val Leu Gly Arg Leu Asp Gln Lys Leu Tyr Ala Ser Ala Glu
645 650 655

Ala Thr Asp Ser Lys Ala Ala Ala Arg Leu Gly Ser Asp Met Gly Glu
660 665 670

Phe Tyr Met Pro Tyr Pro Gly Thr Arg Ile Asn Gln Glu Thr Val Ser
675 680 685

Leu Asp Ala Asn Gly Val Ser Gly Ser Ala Ser Tyr Tyr Glu Val Lys
690 695 700

Phe Ser Asp Pro Ser Lys Pro Asn Gly Gln Ile Trp Thr Gly Val Ile
705 710 715 720

Gly Ser Pro Ala Ala Asn Ala Pro Asp Ala Gly Pro Pro Gln Arg Trp
725 730 735

Phe Val Val Trp Leu Gly Thr Ala Asn Asn Pro Val Asp Lys Gly Ala
740 745 750

Ala Lys Ala Leu Ala Glu Ser Ile Arg Pro Leu Val Ala Pro Pro Pro
755 760 765

Ala Pro Ala Pro Ala Pro Ala Glu Pro Ala Pro Ala Pro Ala Pro Ala
770 775 780

Gly Glu Val Ala Pro Thr Pro Thr Thr Pro Thr Pro Gln Arg Thr Leu
785 790 795 800



Pro Ala 

CLAIMS

We claim: 

1. A polypeptide comprising an antigenic portion of a soluble
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative
substitutions and/or modifications, wherein said antigen has an N-terminal sequence selected
from the group consisting of:

(a) Asp-Pro-Val-Asp-Ala-Val-Ile-Asn-Thr-Thr-Cys-Asn-Tyr-Gly-Gln-
Val-Val-Ala-Ala-Leu (SEQ ID NO: 115);

(b) Ala-Val-Glu-Ser-Gly-Met-Leu-Ala-Leu-Gly-Thr-Pro-Ala-Pro-Ser
(SEQ ID NO: 116);

(c) Ala-Ala-Met-Lys-Pro-Arg-Thr-Gly-Asp-Gly-Pro-Leu-Glu-Ala-Ala-
Lys-Glu-Gly-Arg (SEQ ID NO: 117);

(d) Tyr-Tyr-Trp-Cys-Pro-Gly-Gln-Pro-Phe-Asp-Pro-Ala-Trp-Gly-Pro
(SEQ ID NO: 118);

(e) Asp-Ile-Gly-Ser-Glu-Ser-Thr-Glu-Asp-Gln-Gln-Xaa-Ala-Val (SEQ ID
NO: 119);

(f) Ala-Glu-Glu-Ser-Ile-Ser-Thr-Xaa-Glu-Xaa-Ile-Val-Pro (SEQ ID
NO: 120);

(g) Asp-Pro-Glu-Pro-Ala-Pro-Pro-Val-Pro-Thr-Thr-Ala-Ala-Ser-Pro-Pro-
Ser (SEQ ID NO: 121);

(h) Ala-Pro-Lys-Thr-Tyr-Xaa-Glu-Glu-Leu-Lys-Gly-Thr-Asp-Thr-Gly
(SEQ ID NO: 122);

(i) Asp-Pro-Ala-Ser-Ala-Pro-Asp-Val-Pro-Thr-Ala-Ala-Gln-Leu-Thr-Ser-
Leu-Leu-Asn-Ser-Leu-Ala-Asp-Pro-Asn-Val-Ser-Phe-Ala-Asn (SEQ
ID NO: 123); and

(j) Ala-Pro-Glu-Ser-Gly-Ala-Gly-Leu-Gly-Gly-Thr-Val-Gln-Ala-Gly
(SEQ ID NO: 131);
wherein Xaa may be any amino acid.

2. A polypeptide comprising an immunogenic portion of an
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative
substitutions and/or modifications, wherein said antigen has an N-terminal sequence selected
from the group consisting of:

(a) Asp-Pro-Pro-Asp-Pro-His-Gln-Xaa-Asp-Met-Thr-Lys-Gly-Tyr-Tyr-
Pro-Gly-Gly-Arg-Arg-Xaa-Phe (SEQ ID NO: 124); and

(b) Xaa-Tyr-Ile-Ala-Tyr-Xaa-Thr-Thr-Ala-Gly-Ile-Val-Pro-Gly-Lys-Ile-
Asn-Val-His-Leu-Val (SEQ ID NO: 132), wherein Xaa may be any
amino acid.

3. A polypeptide comprising an antigenic portion of a soluble
M. tuberculosis antigen, or a variant of said antigen that differs only in conservative 
substitutions and/or modifications, wherein said antigen comprises an amino acid sequence 
encoded by a DNA sequence selected from the group consisting of the sequences recited in 
SEQ ID NOS: 1, 2, 4-10, 13-25, 52, 94 and 96, the complements of said sequences, and DNA 
sequences that hybridize to a sequence recited in SEQ ID NOS: 1, 2, 4-10, 13-25, 52, 94 and 
96 or a complement thereof under moderately stringent conditions. 

4. A polypeptide comprising an antigenic portion of an M. tuberculosis
antigen, or a variant of said antigen that differs only in conservative substitutions and/or 
modifications, wherein said antigen comprises an amino acid sequence encoded by a DNA 
sequence selected from the group consisting of the sequences recited in SEQ ID NOS: 26-51, 
133, 134, 158-178 and 196, the complements of said sequences, and DNA sequences that 
hybridize to a sequence recited in SEQ ID NOS: 26-51, 133, 134, 158-178 and 196 or a 
complement thereof under moderately stringent conditions. 

5. A DNA molecule comprising a nucleotide sequence encoding a 
polypeptide according to any one of claims 1-4. 

6. A recombinant expression vector comprising a DNA molecule
according to claim 5. 

7. A host cell transformed with an expression vector according to claim 6. 

8. The host cell of claim 7 wherein the host cell is selected from the group 
consisting of E. coli, yeast and mammalian cells. 

9. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting a biological sample with one or more polypeptides
according to any of claims 1-4; and

(b) detecting in the sample the presence of antibodies that bind to at least
one of the polypeptides, thereby detecting M. tuberculosis infection in the biological sample.

10. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting a biological sample with a polypeptide having an N- 
terminal sequence selected from the group consisting of sequences provided in SEQ ID NO: 
129 and 130; and 

(b) detecting in the sample the presence of antibodies that bind to at least 
one of the polypeptides, thereby detecting M. tuberculosis infection in the biological sample. 

11. A method for detecting M. tuberculosis infection in a biological
sample, comprising: 

(a) contacting a biological sample with one or more polypeptides encoded 
by a DNA sequence selected from the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 
151-155, 184-188, 194-195 and 198, the complements of said sequences, and DNA sequences 
that hybridize to a sequence recited in SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 
194-195 and 198; and 

(b) detecting in the sample the presence of antibodies that bind to at least
one of the polypeptides, thereby detecting M. tuberculosis infection in the biological sample.

12. The method of any one of claims 9-11 wherein step (a) additionally 
comprises contacting the biological sample with a 38 kD M. tuberculosis antigen and step (b)
additionally comprises detecting in the sample the presence of antibodies that bind to the 
38 kD M. tuberculosis antigen. 

13. The method of any one of claims 9-11 wherein the polypeptide(s) are
bound to a solid support.

14. The method of claim 13 wherein the solid support comprises 
nitrocellulose, latex or a plastic material. 

15. The method of any one of claims 9-11 wherein the biological sample is
selected from the group consisting of whole blood, serum, plasma, saliva, cerebrospinal fluid 
and urine. 

16. The method of claim 15 wherein the biological sample is whole blood
or serum.

17. A method for detecting M. tuberculosis infection in a biological 

sample, comprising: 

(a) contacting the sample with at least two oligonucleotide primers in a 
polymerase chain reaction, wherein at least one of the oligonucleotide primers is specific for a 
DNA molecule according to claim 5; and 

(b) detecting in the sample a DNA sequence that amplifies in the presence 
of the oligonucleotide primers, thereby detecting M. tuberculosis infection. 

18. The method of claim 17, wherein at least one of the oligonucleotide
primers comprises at least about 10 contiguous nucleotides of a DNA molecule according to 
claim 5. 
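
The "contiguous nucleotides" criterion recited above can be illustrated with a minimal Python sketch that finds the longest run of nucleotides shared by a primer and a target sequence. The helper names `longest_common_run` and `shares_contiguous_run`, and the reading of "at least about 10" as a hard threshold of 10, are assumptions made only for illustration. The primer is SEQ ID NO: 200, and the target excerpt is taken from the rows of SEQ ID NO: 208 that end at positions 5100 and 5160.

```python
def longest_common_run(a: str, b: str) -> int:
    """Length of the longest contiguous substring shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the run that ended at the previous character of both strings.
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def shares_contiguous_run(primer: str, target: str, min_len: int = 10) -> bool:
    """True if primer and target share at least min_len contiguous nucleotides."""
    return longest_common_run(primer.upper(), target.upper()) >= min_len


# SEQ ID NO: 200, written in the listing's groups of ten.
PRIMER = (
    "GGATCCATAT" "GGGCCATCAT" "CATCATCATC"
    "ACGTGATCGA" "CATCATCGGG" "ACC"
)

# Excerpt of SEQ ID NO: 208 (illustrative choice of target region).
TARGET_EXCERPT = (
    "GAGATATACA" "TATGGGCCAT" "CATCATCATC"
    "ATCACGTGAT" "CGACATCATC" "GGGACCAGCC"
)

print(shares_contiguous_run(PRIMER, TARGET_EXCERPT))  # True
print(longest_common_run(PRIMER, TARGET_EXCERPT))
```

The sketch compares a single strand only; a practical check would also compare the primer against the reverse complement of the target.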

19. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting the sample with at least two oligonucleotide primers in a
polymerase chain reaction, wherein at least one of the oligonucleotide primers is specific for a
DNA sequence selected from the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155,
184-188, 194-195 and 198; and

(b) detecting in the sample a DNA sequence that amplifies in the presence
of the first and second oligonucleotide primers, thereby detecting M. tuberculosis infection.

20. The method of claim 19, wherein at least one of the oligonucleotide 
primers comprises at least about 10 contiguous nucleotides of a DNA sequence selected from 
the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 
198. 

21. The method of claims 17 or 19 wherein the biological sample is 
selected from the group consisting of whole blood, sputum, serum, plasma, saliva, 
cerebrospinal fluid and urine. 

22. A method for detecting M. tuberculosis infection in a biological
sample, comprising: 

(a) contacting the sample with one or more oligonucleotide probes specific 
for a DNA molecule according to claim 5; and 

(b) detecting in the sample a DNA sequence that hybridizes to the 
oligonucleotide probe, thereby detecting M. tuberculosis infection.

23. The method of claim 22 wherein the probe comprises at least about 15
contiguous nucleotides of a DNA molecule according to claim 5. 

24. A method for detecting M. tuberculosis infection in a biological
sample, comprising: 

(a) contacting the sample with one or more oligonucleotide probes specific 
for a DNA sequence selected from the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136,
151-155, 184-188, 194-195 and 198; and 

(b) detecting in the sample a DNA sequence that hybridizes to the 
oligonucleotide probe, thereby detecting M. tuberculosis infection. 

25. The method of claim 24 wherein the oligonucleotide probe comprises 
at least about 15 contiguous nucleotides of a DNA sequence selected from the group 
consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198. 

26. The method of claims 22 or 24 wherein the biological sample is 
selected from the group consisting of whole blood, sputum, serum, plasma, saliva, 
cerebrospinal fluid and urine. 

27. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting the biological sample with a binding agent which is capable 
of binding to a polypeptide according to any one of claims 1-4; and 

(b) detecting in the sample a protein or polypeptide that binds to the 
binding agent, thereby detecting M. tuberculosis infection in the biological sample.

28. A method for detecting M. tuberculosis infection in a biological 
sample, comprising: 

(a) contacting the biological sample with a binding agent which is capable
of binding to a polypeptide having an N-terminal sequence selected from the group consisting 
of sequences provided in SEQ ID NO: 129 and 130; and 

(b) detecting in the sample a protein or polypeptide that binds to the 
binding agent, thereby detecting M. tuberculosis infection in the biological sample. 

29. A method for detecting M. tuberculosis infection in a biological
sample, comprising:

(a) contacting the biological sample with a binding agent which is capable 
of binding to a polypeptide encoded by a DNA sequence selected from the group consisting 
of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198, the complements 
of said sequences, and DNA sequences that hybridize to a sequence recited in SEQ ID 
NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198; and 

(b) detecting in the sample a protein or polypeptide that binds to the
binding agent, thereby detecting M. tuberculosis infection in the biological sample. 

30. The method of any one of claims 27-29 wherein the binding agent is a 
monoclonal antibody. 

31. The method of any one of claims 27-29 wherein the binding agent is a
polyclonal antibody. 

32. A diagnostic kit comprising: 

(a) one or more polypeptides according to any of claims 1-4; and

(b) a detection reagent. 

33. A diagnostic kit comprising:

(a) one or more polypeptides having an N-terminal sequence selected from
the group consisting of sequences provided in SEQ ID NO: 129 and 130; and

(b) a detection reagent.

34. A diagnostic kit comprising:

(a) one or more polypeptides encoded by a DNA sequence selected from 
the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 
198, the complements of said sequences, and DNA sequences that hybridize to a sequence 
recited in SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198; and 

(b) a detection reagent. 

35. The kit of any one of claims 32-34 wherein the polypeptide(s) are 
immobilized on a solid support.

36. The kit of claim 35 wherein the solid support comprises nitrocellulose, 
latex or a plastic material. 

37. The kit of any one of claims 32-34 wherein the detection reagent 
comprises a reporter group conjugated to a binding agent.

38. The kit of claim 37 wherein the binding agent is selected from the 
group consisting of anti-immunoglobulins, Protein G, Protein A and lectins. 

39. The kit of claim 37 wherein the reporter group is selected from the 
group consisting of radioisotopes, fluorescent groups, luminescent groups, enzymes, biotin 
and dye particles. 

40. A diagnostic kit comprising at least two oligonucleotide primers, at 
least one of the oligonucleotide primers being specific for a DNA molecule according to 
claim 5. 

41. A diagnostic kit according to claim 40, wherein at least one of the
oligonucleotide primers comprises at least about 10 contiguous nucleotides of a DNA
molecule according to claim 5.

42. A diagnostic kit comprising at least two oligonucleotide primers, at
least one of the primers being specific for a DNA sequence selected from the group consisting
of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198.

43. A diagnostic kit according to claim 42, wherein at least one of the 
oligonucleotide primers comprises at least about 10 contiguous nucleotides of a DNA
sequence selected from the group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155,
184-188, 194-195 and 198. 

44. A diagnostic kit comprising at least one oligonucleotide probe, the 
oligonucleotide probe being specific for a DNA molecule according to claim 5. 

45. A kit according to claim 44, wherein the oligonucleotide probe 
comprises at least about 15 contiguous nucleotides of a DNA molecule according to claim 5. 

46. A diagnostic kit comprising at least one oligonucleotide probe, the 
oligonucleotide probe being specific for a DNA sequence selected from the group consisting 
of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198.

47. A kit according to claim 46, wherein the oligonucleotide probe 
comprises at least about 15 contiguous nucleotides of a DNA sequence selected from the 
group consisting of SEQ ID NOS: 3, 11, 12, 135, 136, 151-155, 184-188, 194-195 and 198.

48. A monoclonal antibody that binds to a polypeptide according to any of
claims 1-4.

49. A polyclonal antibody that binds to a polypeptide according to any of
claims 1-4.

50. A fusion protein comprising two or more polypeptides according to 
any one of claims 1-4. 

51. A fusion protein comprising one or more polypeptides according to 
any one of claims 1-4 and ESAT-6 (SEQ ID NO: 99). 

52. A fusion protein comprising a polypeptide having an N-terminal 
sequence selected from the group of sequences provided in SEQ ID NOS: 129 and 130.

53. A fusion protein comprising one or more polypeptides according to 
any one of claims 1-4 and the M. tuberculosis antigen 38 kD (SEQ ID NO: 150).

54. A diagnostic kit comprising: 

(a) one or more fusion proteins according to any one of claims 50-53; and 

(b) a detection reagent. 

PCT/US 97/18214



Box I Observations where certain claims were found unsearchable (Continuation of item 1 of first sheet)

This International Search Report has not been established in respect of certain claims under Article 17(2)(a) for the following reasons:

because they relate to subject matter not required to be searched by this Authority, namely:

because they relate to parts of the International Application that do not comply with the prescribed requirements to such
an extent that no meaningful International Search can be carried out, specifically:

because they are dependent claims and are not drafted in accordance with the second and third sentences of Rule 6.4(a).

Box II Observations where unity of invention is lacking (Continuation of item 2 of first sheet)

This International Searching Authority found multiple inventions in this international application, as follows:

see continuation-sheet

□ As all required additional search fees were timely paid by the applicant, this International Search Report covers
all searchable claims.

As all searchable claims could be searched without effort justifying an additional fee, this Authority did not invite payment
of any additional fee.

□ As only some of the required additional search fees were timely paid by the applicant, this International Search Report
covers only those claims for which fees were paid, specifically claims Nos.:

4. [X] No required additional search fees were timely paid by the applicant. Consequently, this International Search Report is
restricted to the invention first mentioned in the claims; it is covered by claims Nos.:

1, 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 48-51, 53, 54 all partially
(subject 1. on next sheet)

Remark on Protest

No protest accompanied the payment of additional search fees.

1. Claims: 1, 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44,
45, 48-51, 53, 54 all partially. 



A polypeptide comprising an antigenic portion of a soluble 
M. tuberculosis antigen or a variant, having an N-terminal 
amino acid sequence as in Seq.ID:115 and/or encoded by a DNA
molecule as in Seq.ID:96, complements of said sequence or 
sequences hybridizing to it. A DNA molecule comprising a 
sequence encoding said polypeptide. An expression vector 
comprising said DNA molecule, a host cell transformed with 
said expression vector. A method for detecting M. 
tuberculosis infection in a biological sample by detection 
of antibodies binding to said polypeptide or by detection of 
said polypeptide. A method for detecting M. tuberculosis 
infection in a biological sample by detection of said DNA 
sequence. Diagnostic kits thereof. An antibody binding to 
said polypeptide. A fusion protein comprising said 
polypeptide. Diagnostic kit comprising said fusion protein. 



2. Claims: 1, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 
48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:116. 



3. Claims: 1, 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 
45, 48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:117 and 25.



4. Claims: 1, 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 
45, 48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:118 and 24. 



5. Claims: 1, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 
48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:119. 



6. Claims: 1, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 
48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:120.

7. Claims: 1, 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44,

45, 48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:121 and 52. 

8. Claims: 1, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:122.

9. Claims: 1, 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44,

45, 48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:123 and 94.

10. Claims: 1, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:131.

11. Claims: 2, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:124.

12. Claims: 2, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:132.

13. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:1.

14. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:2. 

15. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,
48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:4 and 17.

16. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,
48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:5.

17. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:6.

18. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:7. 

19. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:8. 

20. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:9. 

21. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:10 and 13. 

22. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:14. 

23. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:15. 

24. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:16. 

25. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:18. 

26. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:19.

27. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:20.

28. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:21. 

29. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:22. 

30. Claims: 3, 5-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:23.

31. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:26. 

32. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:27.

33. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:28.

34. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:29. 

35. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:30.

36. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:31. 

37. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:32. 

38. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:33. 

39. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:34. 

40. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:35. 

41. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:36. 

42. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:37.

43. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:38.

44. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:39.

45. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:40.

46. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:41. 

47. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,
48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:42. 

48. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:43, 44 and 178. 

49. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:45. 

50. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:46. 

51. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:47. 

52. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:48. 

53. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:49.



54. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 
48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:50.

55. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:51. 

56. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:133.

57. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:134. 

58. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:158. 

59. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:159.

60. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially.

Same as invention 1 but for Seq.ID:160.

61. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:161.

62. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 

48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:162. 

63. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,
48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:163. 



64. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 
48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:164 and 165. 



65. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 
48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:166 and 167. 



66. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 
48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:168 and 169.



67. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 
48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:170 and 171.



68. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 
48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:172 and 173.



69. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 
48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:174 and 175.



70. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45, 
48-51, 53, 54 all partially. 



Same as invention 1 but for Seq.ID:176 and 177.

71. Claims: 4-9, 12-18, 21-23, 26, 27, 30-32, 35-41, 44, 45,

48-51, 53, 54 all partially. 

Same as invention 1 but for Seq.ID:196. 

72. Claims: 10, 12-16, 28, 30, 31, 33, 35-39, 52, 

54 all partially. 

A method for detecting M. tuberculosis infection in a 
biological sample by detection of antibodies binding to a 
polypeptide having an N-terminal sequence as in Seq.ID:129, 
or by detection of a protein or polypeptide that binds to an 
agent binding to a polypeptide having an N-terminal sequence 
as in Seq.ID:129. Diagnostic kits thereof. A fusion protein 
comprising said polypeptide. Diagnostic kit comprising said 
fusion protein. 

73. Claims: 10, 12-16, 28, 30, 31, 33, 35-39, 52, 

54 all partially. 

Same as invention 72 but for Seq.ID:130. 

74. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46, 

47 all partially. 

A method for detecting M. tuberculosis infection in a 
biological sample by detection of antibodies binding to a 
polypeptide encoded by a DNA sequence consisting of 
Seq.ID:3, complements or hybridizing sequences. A method for 
detecting M. tuberculosis infection in a biological sample 
by detection of said DNA sequence. A method for detecting M. 
tuberculosis infection in a biological sample by detection 
of a protein or polypeptide that binds to an agent binding 
to a polypeptide encoded by Seq.ID:3, complements or 
hybridizing sequences. Diagnostic kits thereof. 

75. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46, 

47 all partially. 



Same as invention 74 but for Seq.ID:11.



76. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46, 
47 all partially. 



Same as invention 74 but for Seq.ID:12. 

77. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46,

47 all partially. 

Same as invention 74 but for Seq.ID:135. 

78. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46, 

47 all partially. 

Same as invention 74 but for Seq.ID:136. 

79. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46, 

47 all partially. 

Same as invention 74 but for Seq.ID:151. 

80. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46, 

47 all partially. 

Same as invention 74 but for Seq.ID:152. 

81. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46, 

47 all partially. 

Same as invention 74 but for Seq.ID:153. 

82. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46,

47 all partially. 

Same as invention 74 but for Seq.ID:154 and 155. 

83. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46, 

47 all partially. 

Same as invention 74 but for Seq.ID:184. 

84. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46,

47 all partially. 

Same as invention 74 but for Seq.ID:185. 

85. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46,
47 all partially. 



Same as invention 74 but for Seq.ID:186. 



86. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46, 
47 all partially. 



Same as invention 74 but for Seq.ID:187. 



87. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46, 
47 all partially. 



Same as invention 74 but for Seq.ID:188.



88. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46, 
47 all partially. 



Same as invention 74 but for Seq.ID:194 and 195. 



89. Claims: 11-16, 19-21, 24-26, 29-31, 34-39, 42, 43, 46, 
47 all partially. 



Same as invention 74 but for Seq.ID:198. 